Abstract. A subset A of N is called an IP-set if A contains all finite sums of distinct terms of some
infinite sequence $(x_n)_{n \in \mathbb{N}}$ of natural numbers. Central sets, first introduced by Furstenberg using
notions from topological dynamics, constitute a special class of IP-sets possessing rich combinatorial 
properties: Each central set contains arbitrarily long arithmetic progressions, and solutions to all 
partition regular systems of homogeneous linear equations. In this paper we investigate central sets 
in the framework of combinatorics on words. Using various families of uniformly recurrent words, 
including Sturmian words, the Thue-Morse word and fixed points of weak mixing substitutions, we 
generate an assortment of central sets which reflect the rich combinatorial structure of the underlying 
words. The results in this paper rely on interactions between different areas of mathematics, some 
of which had not previously been directly linked. They include the general theory of combinatorics 
on words, abstract numeration systems, and the beautiful theory, developed by Hindman, Strauss 
and others, linking IP-sets and central sets to the algebraic/topological properties of the Stone-Čech
compactification of N. 

1. Introduction 

Let N = {0, 1, 2, 3, . . .} denote the set of natural numbers, and Fin(N) the set of all non-empty 
finite subsets of N. 

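Definition 1.1. A subset $A$ of $\mathbb{N}$ is called an IP-set if $A$ contains $\{\sum_{n \in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\}$ for
some infinite sequence of natural numbers $x_0 < x_1 < x_2 < \cdots$. A subset $A \subseteq \mathbb{N}$ is called an IP*-set
if $A \cap B \neq \emptyset$ for every IP-set $B \subseteq \mathbb{N}$.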
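
For a finite list of terms, the set of finite sums appearing in Definition 1.1 can be enumerated directly. The following is a minimal sketch only: the helper names `finite_sums` and `contains_finite_sums` are hypothetical, and a check on a finite truncation can give evidence but can never certify that an infinite set is an IP-set.

```python
from itertools import combinations


def finite_sums(xs):
    """Return all sums of non-empty sub-collections of distinct terms of xs."""
    sums = set()
    for r in range(1, len(xs) + 1):
        for combo in combinations(xs, r):
            sums.add(sum(combo))
    return sums


def contains_finite_sums(A, xs):
    """Check whether the set A contains every finite sum of the terms xs."""
    return finite_sums(xs) <= set(A)


# The finite sums of 1, 2, 4, 8 are exactly the integers 1..15.
xs = [1, 2, 4, 8]
fs = finite_sums(xs)
```

Here `contains_finite_sums(range(1, 16), xs)` holds, while removing any single element of `fs` from the candidate set makes the check fail.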



By a celebrated result of N. Hindman [21], given any finite partition of N, at least one element of 
the partition is an IP-set. It follows from Hindman's theorem that every IP*-set is an IP-set, but the 
converse is in general not true. In fact, more generally Hindman shows that given any finite parti- 
tion of an IP-set, at least one element of the partition is again an IP-set. In other words the property 
of being an IP-set is partition regular, i.e., cannot be destroyed via a finite partitioning. Other 
examples of partition regularity are given by the pigeonhole principle, sets having positive upper 
density, and sets having arbitrarily long arithmetic progressions (Van der Waerden's theorem). In 
[20], Furstenberg introduced a special class of IP-sets, called central sets, having a substantial 
combinatorial structure. The property of being central is also partition regular. Central sets were 
originally defined in terms of topological dynamics: 

Definition 1.2. A subset $A \subseteq \mathbb{N}$ is called central if there exists a compact metric space $(X, d)$ and
a continuous map $T : X \to X$, points $x, y \in X$ and a neighborhood $U$ of $y$ such that

• $y$ is a uniformly recurrent point in $X$,

• $x$ and $y$ are proximal,

• $A = \{n \in \mathbb{N} \mid T^n(x) \in U\}$.

We say $A \subseteq \mathbb{N}$ is central* if $A \cap B \neq \emptyset$ for every central set $B \subseteq \mathbb{N}$.

Recall that $x$ is said to be uniformly recurrent in $X$ if for every neighborhood $V$ of $x$ the set
$\{n \mid T^n(x) \in V\}$ is syndetic, i.e., of bounded gap. Two points $x, y \in X$ are said to be proximal if
for every $\epsilon > 0$ there exists $n \in \mathbb{N}$ such that $d(T^n(x), T^n(y)) < \epsilon$. We remark that from the above
definition, it is not at all evident that central sets are IP-sets. We later give an alternative definition
(see Definition 3.4) which makes this point clear. The equivalence between the two definitions is
due to Bergelson and Hindman [5].

The question of determining whether a given subset $A \subseteq \mathbb{N}$ is an IP-set or a central set is typically
quite difficult, even though for every $A$, either $A$ or its complement is an IP-set (resp. central set).
It turns out that in each case this question may be reformulated in terms of whether or not the set $A$
belongs to a certain class of ultrafilters on $\mathbb{N}$ (see Theorem 5.12 in [24] in the case of IP-sets and [5]
in the case of central sets). But the question of belonging or not to a given (non-principal) ultrafilter
is generally equally mysterious. An equivalent word combinatorial reformulation of this question
is as follows: Given a binary word $\omega = \omega_0 \omega_1 \omega_2 \cdots \in \{0, 1\}^{\infty}$, put $\omega\big|_0 = \{n \in \mathbb{N} \mid \omega_n = 0\}$
and $\omega\big|_1 = \{n \in \mathbb{N} \mid \omega_n = 1\}$. The question is then to determine whether the set $\omega\big|_0$ or $\omega\big|_1$ is an
IP-set or central set. Of course in general, this reformulation is as difficult as the original question.
However, should the word u be characterized by some rich combinatorial properties, or be gener- 
ated by some "simple" combinatorial or geometric algorithm (such as a substitution rule, a finite 
state automaton, a Toeplitz rule...) or arise as a natural coding of a reasonably simple symbolic 
dynamical system, then the underlying rigid combinatorial structure of the word may provide in- 
sight to our previous question. Furthermore, such families of words may be used to obtain simple 
constructions of central sets having additional nice properties inherited from the rich underlying 
combinatorial structure. One of our objectives here is to illustrate this latter point. 

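Let $\mathcal{A}$ denote a finite non-empty set (called the alphabet) and $\omega = \omega_0 \omega_1 \omega_2 \cdots \in \mathcal{A}^{\mathbb{N}}$. For each
finite word $u$ on the alphabet $\mathcal{A}$ we set

$\omega\big|_u = \{n \in \mathbb{N} \mid \omega_n \omega_{n+1} \ldots \omega_{n+|u|-1} = u\}$.

In other words, $\omega\big|_u$ denotes the set of all occurrences of $u$ in $\omega$.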
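
On a finite prefix of an infinite word, the set of occurrences of a factor can be listed by a direct scan. The sketch below is an illustration under assumptions: `occurrences` is a hypothetical helper, and the sample prefix is an arbitrary binary string chosen for the example, so the output only describes positions inside that prefix.

```python
def occurrences(w, u):
    """List the positions n at which the finite word u occurs in the finite word w."""
    k = len(u)
    return [n for n in range(len(w) - k + 1) if w[n:n + k] == u]


# Arbitrary sample prefix of a binary word (an assumption made for illustration).
w = "0100101001001"
occ_0 = occurrences(w, "0")
occ_1 = occurrences(w, "1")
occ_00 = occurrences(w, "00")
```

The occurrence sets of the two letters partition the positions of the prefix, which is the finite shadow of the partition of the natural numbers by the sets of occurrences of 0 and of 1.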

In this paper we investigate partitions of $\mathbb{N}$ by sets of the form $\omega\big|_u$ defined by a uniformly recurrent
word $\omega$. Our goal is to study these partitions in the framework of IP-sets and central sets. We begin
by showing that in this framework IP-sets and central sets are one and the same:

Theorem 1. Let $\omega \in \mathcal{A}^{\mathbb{N}}$ be uniformly recurrent. Then the set $\omega\big|_u$ is an IP-set if and only if it is a
central set.

This allows us to simultaneously state our results in terms of IP-sets and central sets. 

We begin by considering the simplest aperiodic infinite words, namely Sturmian words. Sturmian
words are infinite words over a binary alphabet having exactly $n + 1$ factors of length $n$ for
each $n \geq 0$. Their origin can be traced back to the astronomer J. Bernoulli III in 1772. A fundamental
result due to Morse and Hedlund [29] states that each aperiodic (meaning non-ultimately
periodic) infinite word must contain at least $n + 1$ factors of each length $n \geq 0$. Thus Sturmian
words are those aperiodic words of lowest factor complexity. They arise naturally in many different
areas of mathematics including combinatorics, algebra, number theory, ergodic theory, dynamical
systems and differential equations. Sturmian words are also of great importance in theoretical
physics and in theoretical computer science and are used in computer graphics as digital approximation
of straight lines.

The next two theorems give a complete characterization of those factors $u$ of a Sturmian word
$\omega \in \{0, 1\}^{\mathbb{N}}$ for which $\omega\big|_u$ is an IP-set (respectively central set). First, a Sturmian word $\omega$ is
called singular if $T^n(\omega) = \tilde{\omega}$ for some $n \geq 1$, where $T$ denotes the shift map and $\tilde{\omega}$ denotes
the characteristic Sturmian word in the shift orbit closure of $\omega$ (see §2.2 for the definition of a
characteristic Sturmian word). Otherwise it is said to be nonsingular.

Theorem 2. Let $\omega \in \Omega$ be a nonsingular Sturmian word, and $u$ a factor of $\omega$. Then $\omega\big|_u$ is an IP-set
(resp. central set) if and only if $u$ is a prefix of $\omega$. Hence for every prefix $v$ of $\omega$ and $n \in \omega\big|_v$ the set
$\omega\big|_v - n$ is an IP*-set (resp. central* set).

Theorem 3. Let $\omega \in \Omega$ be a Sturmian word such that $T^{n_0}(\omega) = \tilde{\omega}$ with $n_0 \geq 1$. Then $\omega\big|_u$ is an
IP-set (or central set) if and only if either $u$ is a prefix of $\omega$ or a prefix of $\omega'$ where $\omega'$ is the unique
other element of $\Omega$ with $T^{n_0}(\omega') = \tilde{\omega}$.

Some (but not all) of the results on Sturmian partitions extend to the class of Arnoux-Rauzy words, 
which may be regarded as natural combinatorial extensions of Sturmian words to larger alphabets 

El. 

Using the $m$-bonacci word and the iterated palindromic closure operator, we construct infinite partitions of
$\mathbb{N}$ into central sets having special translation invariant properties.

We also consider partitions defined by words generated by substitution rules. For instance, by
considering partitions of $\mathbb{N}$ defined by words generated by the generalized Thue-Morse substitution
to an alphabet of size $r \geq 2$, we show that

Theorem 4. For each pair of positive integers $r$ and $N$ there exists a partition of

$\mathbb{N} = A_1 \cup A_2 \cup \cdots \cup A_r$

such that

• $A_i - n$ is a central set for each $1 \leq i \leq r$ and $1 \leq n \leq N$.

• For each $n > N$, exactly one of the sets $\{A_1 - n, A_2 - n, \ldots, A_r - n\}$ is a central set.

The second assertion of Theorem 4 relies on the fact that each fixed point of the generalized Thue-Morse
substitution is distal.

By considering partitions defined by words generating minimal subshifts which are topologically
weak mixing (for example the subshift generated by the substitution $0 \mapsto 001$ and $1 \mapsto 11001$) we
prove that

Theorem 5. For each positive integer $r$ there exists a partition of $\mathbb{N} = A_1 \cup A_2 \cup \cdots \cup A_r$ such
that for each $1 \leq i \leq r$ and $n \geq 0$, the set $A_i - n$ is a central set.

The results in this paper rely on various interactions between combinatorics on words, topological
dynamics and the algebraic and topological properties of the Stone-Čech compactification
$\beta\mathbb{N}$. We regard $\beta\mathbb{N}$ as the collection of all ultrafilters on $\mathbb{N}$. An ultrafilter may be thought of as
a $\{0, 1\}$-valued finitely additive probability measure defined on all subsets of $\mathbb{N}$. This notion of
measure induces a notion of convergence ($p\text{-}\lim_n$) for sequences indexed by $\mathbb{N}$, which we regard
as a mapping $p^*$ from words to words. This key notion of convergence allows us to apply ideas
from combinatorics on words in the framework of ultrafilters.

Acknowledgements. The authors would like to thank V. Bergelson and Y. Son for many insightful 
e-mail exchanges and in particular for pointing out to us the key feature used in the proof of 
Theorem 5 relating topologically weak mixing with proximality. We are also extremely grateful
to N. Hindman for his comments and suggestions on a preliminary version of this paper. The third 
author is partially supported by a grant from the Academy of Finland. 

2. Words and substitutions 

In this section we give a brief summary of some of the basic background in combinatorics on 
words. 

2.1. Words & subshifts. Given a finite non-empty set $\mathcal{A}$ (called the alphabet), we denote by $\mathcal{A}^*$,
$\mathcal{A}^{\mathbb{N}}$ and $\mathcal{A}^{\mathbb{Z}}$ respectively the set of finite words, the set of (right) infinite words, and the set of
bi-infinite words over the alphabet $\mathcal{A}$. Given a finite word $u = a_1 a_2 \ldots a_n$ with $n \geq 1$ and $a_i \in \mathcal{A}$,
we denote the length $n$ of $u$ by $|u|$. The empty word will be denoted by $\varepsilon$ and we set $|\varepsilon| = 0$. We
put $\mathcal{A}^+ = \mathcal{A}^* - \{\varepsilon\}$. For each $a \in \mathcal{A}$, we let $|u|_a$ denote the number of occurrences of the letter $a$
in $u$.

Given an infinite word $\omega \in \mathcal{A}^{\mathbb{N}}$, a word $u \in \mathcal{A}^+$ is called a factor of $\omega$ if $u = \omega_i \omega_{i+1} \cdots \omega_{i+n}$ for
some natural numbers $i$ and $n$. We denote by $\mathcal{F}_\omega(n)$ the set of all factors of $\omega$ of length $n$, and set

$\mathcal{F}_\omega = \bigcup_{n \in \mathbb{N}} \mathcal{F}_\omega(n)$.

A factor $u$ of $\omega$ is called right special if both $ua$ and $ub$ are factors of $\omega$ for some pair of distinct
letters $a, b \in \mathcal{A}$. Similarly $u$ is called left special if both $au$ and $bu$ are factors of $\omega$ for some pair of
distinct letters $a, b \in \mathcal{A}$. The factor $u$ is called bispecial if it is both right special and left special.
For each factor $u \in \mathcal{F}_\omega$ set

$\omega\big|_u = \{n \in \mathbb{N} \mid \omega_n \omega_{n+1} \ldots \omega_{n+|u|-1} = u\}$.

We say $\omega$ is recurrent if for every $u \in \mathcal{F}_\omega$ the set $\omega\big|_u$ is infinite. We say $\omega$ is uniformly recurrent if
for every $u \in \mathcal{F}_\omega$ the set $\omega\big|_u$ is syndetic, i.e., of bounded gap.
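
The bounded-gap condition can be explored numerically on a finite prefix by measuring the largest distance between consecutive occurrences of a factor. This is only a sketch: `occurrences` and `max_gap` are hypothetical helper names, the sample string is a 34-letter prefix of the Fibonacci word (the fixed point of 0 -> 01, 1 -> 0 discussed in §2.2), and a finite computation gives evidence for syndeticity rather than a proof.

```python
def occurrences(w, u):
    """List the positions at which the finite word u occurs in the finite word w."""
    k = len(u)
    return [n for n in range(len(w) - k + 1) if w[n:n + k] == u]


def max_gap(positions):
    """Largest difference between consecutive entries of a sorted position list."""
    return max(b - a for a, b in zip(positions, positions[1:]))


# 34-letter prefix of the Fibonacci word (assumed sample data for this sketch).
w = "0100101001001010010100100101001001"
gap_0 = max_gap(occurrences(w, "0"))
gap_1 = max_gap(occurrences(w, "1"))
gap_00 = max_gap(occurrences(w, "00"))
```

On this prefix the observed maximal gaps are 2 for the factor 0, 3 for the factor 1 and 5 for the factor 00.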
We endow $\mathcal{A}^{\mathbb{N}}$ with the topology generated by the metric

$d(x, y) = \frac{1}{2^n}$ where $n = \inf\{k : x_k \neq y_k\}$

whenever $x = (x_n)_{n \in \mathbb{N}}$ and $y = (y_n)_{n \in \mathbb{N}}$ are two elements of $\mathcal{A}^{\mathbb{N}}$. Let $T : \mathcal{A}^{\mathbb{N}} \to \mathcal{A}^{\mathbb{N}}$ denote the
shift transformation defined by $T : (x_n)_{n \in \mathbb{N}} \mapsto (x_{n+1})_{n \in \mathbb{N}}$. By a subshift on $\mathcal{A}$ we mean a pair
$(X, T)$ where $X$ is a closed and $T$-invariant subset of $\mathcal{A}^{\mathbb{N}}$. A subshift $(X, T)$ is said to be minimal
whenever $X$ and the empty set are the only $T$-invariant closed subsets of $X$. To each $\omega \in \mathcal{A}^{\mathbb{N}}$ is
associated the subshift $(X, T)$ where $X$ is the shift orbit closure of $\omega$. If $\omega$ is uniformly recurrent,
then the associated subshift $(X, T)$ is minimal. Thus any two words $x$ and $y$ in $X$ have exactly the
same set of factors, i.e., $\mathcal{F}_x = \mathcal{F}_y$. In this case we denote by $\mathcal{F}_X$ the set of factors of any word
$x \in X$.
Two points $x, y$ in $X$ are said to be proximal if and only if for each $N > 0$ there exists $n \in \mathbb{N}$
such that

$x_n x_{n+1} \cdots x_{n+N} = y_n y_{n+1} \cdots y_{n+N}$.

Two points $x, y \in X$ are said to be regionally proximal if for every prefix $u$ of $x$ and $v$ of $y$, there
exist points $x', y' \in X$ with $x'$ beginning in $u$ and $y'$ beginning in $v$ and with $x'$ proximal to $y'$.
Clearly if two points in $X$ are proximal, then they are regionally proximal. A point $x \in X$ is
called distal if the only point in $X$ proximal to $x$ is $x$ itself. A minimal subshift $(X, T)$ is said to
be topologically mixing if for every pair of factors $u, v \in \mathcal{F}_X$ there exists a positive integer $N$
such that for each $n \geq N$, there exists a block of the form $uWv \in \mathcal{F}_X$ with $|W| = n$. A minimal
subshift $(X, T)$ is said to be topologically weak mixing if for every pair of factors $u, v \in \mathcal{F}_X$ the
set

$\{n \in \mathbb{N} \mid u\mathcal{A}^n v \cap \mathcal{F}_X \neq \emptyset\}$

is thick, i.e., for every positive integer $N$, the set contains $N$ consecutive positive integers.

Many of the words and subshifts considered in this paper are generated by substitutions. A substitution
$\tau$ on an alphabet $\mathcal{A}$ is a mapping $\tau : \mathcal{A} \to \mathcal{A}^+$. The mapping $\tau$ extends by concatenation
to maps (also denoted $\tau$) $\mathcal{A}^* \to \mathcal{A}^*$ and $\mathcal{A}^{\mathbb{N}} \to \mathcal{A}^{\mathbb{N}}$.

Let $\tau$ be a primitive substitution on $\mathcal{A}$. A word $\omega \in \mathcal{A}^{\mathbb{N}}$ is called a fixed point of $\tau$ if $\tau(\omega) = \omega$,
and is called a periodic point if $\tau^m(\omega) = \omega$ for some $m > 0$. Although $\tau$ may fail to have a
fixed point, it has at least one periodic point. Associated to $\tau$ is the topological dynamical system
$(X, T)$, where $X$ is the shift orbit closure of a periodic point $\omega$ of $\tau$. The primitivity of $\tau$ implies
that $(X, T)$ is independent of the choice of periodic point and is minimal.

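2.2. Sturmian words & generalizations. Let $\omega \in \mathcal{A}^{\mathbb{N}}$ and set

$p_\omega(n) = \mathrm{Card}(\mathcal{F}_\omega(n))$.

The function $p_\omega : \mathbb{N} \to \mathbb{N}$ is called the factor complexity function of $\omega$. Given a minimal subshift
$(X, T)$ on $\mathcal{A}$, we have $\mathcal{F}_\omega(n) = \mathcal{F}_{\omega'}(n)$ for all $\omega, \omega' \in X$ and $n \in \mathbb{N}$. Thus we can define the factor
complexity $p_{(X,T)}(n)$ of a minimal subshift $(X, T)$ by

$p_{(X,T)}(n) = p_\omega(n)$

for any $\omega \in X$.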

A word $\omega \in \mathcal{A}^{\mathbb{N}}$ is periodic if there exists a positive integer $p$ such that $\omega_{i+p} = \omega_i$ for all
indices $i$, and it is ultimately periodic if $\omega_{i+p} = \omega_i$ for all sufficiently large $i$. An infinite word is
aperiodic if it is not ultimately periodic. By a celebrated result due to Hedlund and Morse [29], a
word is ultimately periodic if and only if its factor complexity is uniformly bounded. In particular,
$p_\omega(n) < n$ for all $n$ sufficiently large. Words whose factor complexity $p_\omega(n) = n + 1$ for all
$n \geq 0$ are called Sturmian words. Thus, Sturmian words are those aperiodic words having the
lowest complexity. Since $p_\omega(1) = 2$, it follows that Sturmian words are binary words. The most
extensively studied Sturmian word is the so-called Fibonacci word

$f = 01001010010010100101001001010010010100101001001010010 \cdots$

fixed by the morphism $0 \mapsto 01$ and $1 \mapsto 0$. Let $\omega \in \{0, 1\}^{\mathbb{N}}$ be a Sturmian word, and let $\Omega$ denote
the shift orbit closure of $\omega$. The condition $p_\omega(n) = n + 1$ implies the existence of exactly one right
special and one left special factor of each length. Clearly, given any two left special factors, one is
necessarily a prefix of the other. It follows that $\Omega$ contains a unique word all of whose prefixes are
left special factors of $\omega$. Such a word is called the characteristic word and denoted $\tilde{\omega}$. It follows that
both $0\tilde{\omega}, 1\tilde{\omega} \in \Omega$. It is readily verified that the Fibonacci word above is a characteristic Sturmian
word. A Sturmian word $\omega$ is called singular if $T^n(\omega) = \tilde{\omega}$ for some $n \geq 1$. Otherwise it is said to
be nonsingular.
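
The statements about the Fibonacci word can be checked experimentally on a long prefix. The following sketch is illustrative only: the function names are hypothetical, the prefix length 1000 is an arbitrary choice assumed to be long enough for the short factors examined, and a finite computation does not replace the proofs referred to in the text.

```python
def apply_morphism(word, rules):
    """Apply a substitution, given as a dict letter -> image, to a finite word."""
    return "".join(rules[c] for c in word)


def fixed_point_prefix(rules, seed, length):
    """Iterate the substitution from seed and return a prefix of the given length."""
    w = seed
    while len(w) < length:
        w = apply_morphism(w, rules)
    return w[:length]


def factor_complexity(w, n):
    """Number of distinct factors of length n occurring in the finite word w."""
    return len({w[i:i + n] for i in range(len(w) - n + 1)})


def is_left_special(w, u):
    """True if both 0u and 1u occur in the finite binary word w."""
    return ("0" + u) in w and ("1" + u) in w


FIBONACCI = {"0": "01", "1": "0"}  # the morphism 0 -> 01, 1 -> 0 from the text
f = fixed_point_prefix(FIBONACCI, "0", 1000)
complexities = [factor_complexity(f, n) for n in range(8)]
prefixes_left_special = all(is_left_special(f, f[:k]) for k in range(1, 7))
```

On this prefix the counts for lengths 0 through 7 are 1, 2, ..., 8, in line with complexity n + 1, and every prefix of length at most 6 is left special, as expected of a characteristic word.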

Sturmian words admit various types of characterizations of geometric and combinatorial nature.
We give two such characterizations which will be used in the paper: as irrational rotations on
the unit circle and as mechanical words. In [29] Hedlund and Morse showed that each Sturmian
word may be realized measure-theoretically by an irrational rotation on the circle. That is, every
Sturmian word is obtained by coding the symbolic orbit of a point $x$ on the circle (of circumference
one) under a rotation $R_\alpha$ by an irrational angle $\alpha$, $0 < \alpha < 1$, where the circle is partitioned into
two complementary intervals, one of length $\alpha$ and the other of length $1 - \alpha$. Conversely,
each such coding gives rise to a Sturmian word. The quantity $\alpha$ is called the slope. Namely, the
rotation by angle $\alpha$ is the mapping $R_\alpha$ from $[0, 1)$ (identified with the unit circle) to itself defined
by $R_\alpha(x) = \{x + \alpha\}$, where $\{x\} = x - \lfloor x \rfloor$ is the fractional part of $x$. Considering a partition of
$[0, 1)$ into $I_0 = [0, 1 - \alpha)$, $I_1 = [1 - \alpha, 1)$, define a word

$$s_{\alpha,\rho}(n) = \begin{cases} 0 & \text{if } R_\alpha^n(\rho) = \{\rho + n\alpha\} \in I_0, \\ 1 & \text{if } R_\alpha^n(\rho) = \{\rho + n\alpha\} \in I_1. \end{cases}$$

One can also define $I_0' = (0, 1 - \alpha]$, $I_1' = (1 - \alpha, 1]$; the corresponding word is denoted by $s'_{\alpha,\rho}$.
For a Sturmian word $\omega$ of slope $\alpha$ its subshift $\Omega$ is given by $\Omega = \{s_{\alpha,\rho}, s'_{\alpha,\rho} \mid \rho \in [0, 1)\}$.
A straightforward computation shows that

$s_{\alpha,\rho}(n) = \lfloor \alpha(n + 1) + \rho \rfloor - \lfloor \alpha n + \rho \rfloor$,

$s'_{\alpha,\rho}(n) = \lceil \alpha(n + 1) + \rho \rceil - \lceil \alpha n + \rho \rceil$;

$s_{\alpha,\rho}$ and $s'_{\alpha,\rho}$ are called, respectively, the lower and upper mechanical words (of slope $\alpha$) based at $\rho$.
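
The agreement between the rotation coding and the floor formula can be tested numerically. The sketch below rests on stated assumptions: the function names are hypothetical, floating-point arithmetic is used (so it is only reliable while the orbit stays away from the interval endpoints), and the choice of slope and base point both equal to (3 - sqrt(5))/2 relies on the standard fact, not stated in the text, that this choice yields the Fibonacci word.

```python
import math


def s_rotation(alpha, rho, n):
    """Letter n of the coding of the rotation orbit of rho with I0 = [0, 1 - alpha)."""
    x = (rho + n * alpha) % 1.0
    return 0 if x < 1 - alpha else 1


def s_floor(alpha, rho, n):
    """Letter n given by the floor formula for the mechanical word."""
    return math.floor(alpha * (n + 1) + rho) - math.floor(alpha * n + rho)


# Assumed parameters: slope and base point both equal to (3 - sqrt(5)) / 2.
alpha = (3 - math.sqrt(5)) / 2
rho = alpha
word_rotation = "".join(str(s_rotation(alpha, rho, n)) for n in range(34))
word_floor = "".join(str(s_floor(alpha, rho, n)) for n in range(34))
```

Both strings coincide on these 34 letters and agree with the prefix of the Fibonacci word displayed in this section.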

In [0Q| Arnoux and Rauzy introduced a class of uniformly recurrent (minimal) sequences $\omega$ on
an $m$-letter alphabet of complexity $p_\omega(n) = (m - 1)n + 1$ characterized by the following combinatorial
criterion known as the $*$ condition: $\omega$ admits exactly one right special and one left special
factor of each length. We call them Arnoux-Rauzy sequences. This condition distinguishes them
from other sequences of complexity $(m - 1)n + 1$ such as those obtained by coding trajectories
of $m$-interval exchange transformations. These words are generally regarded as natural combinatorial
generalizations of Sturmian words to higher alphabets. In particular, the Fibonacci word
generalizes to the $m$-bonacci word fixed by the substitution

$\sigma_m : \{0, 1, \ldots, m - 1\} \to \{0, 1, \ldots, m - 1\}^*$

given by

$$\sigma_m(i) = \begin{cases} 0(i + 1) & \text{for } 0 \leq i < m - 1, \\ 0 & \text{for } i = m - 1. \end{cases}$$
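
The m-bonacci substitution sends each letter i < m - 1 to the two-letter word 0(i + 1) and the last letter m - 1 to 0. A minimal sketch of how its fixed point can be generated and its complexity sampled follows; the function names are hypothetical, letters are represented as integers, and the prefix length used for the complexity count is an assumption chosen to be comfortably large for the short factors examined.

```python
def m_bonacci_rules(m):
    """Substitution i -> 0(i+1) for i < m - 1 and (m - 1) -> 0, with m >= 2."""
    if m < 2:
        raise ValueError("m must be at least 2")
    rules = {i: [0, i + 1] for i in range(m - 1)}
    rules[m - 1] = [0]
    return rules


def m_bonacci_prefix(m, length):
    """Prefix of the fixed point, obtained by iterating the substitution from 0."""
    rules = m_bonacci_rules(m)
    w = [0]
    while len(w) < length:
        w = [b for a in w for b in rules[a]]
    return w[:length]


def factor_complexity(w, n):
    """Number of distinct factors of length n in the finite word w."""
    return len({tuple(w[i:i + n]) for i in range(len(w) - n + 1)})


tribonacci = m_bonacci_prefix(3, 3000)
complexities = [factor_complexity(tribonacci, n) for n in range(6)]
```

For m = 2 this reproduces the Fibonacci word, and for m = 3 the sampled counts 1, 3, 5, 7, 9, 11 match the complexity (m - 1)n + 1.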



However, many of the dynamical and geometrical interpretations of Sturmian words do not
extend to this new class of words (see [10] for example).

In the subsequent sections we will consider partitions of $\mathbb{N}$ defined by words. Let $\omega \in \mathcal{A}^{\mathbb{N}}$, and
let $\mathcal{F}$ denote the set of factors of $\omega$. A finite subset $X$ is called an $\mathcal{F}$-prefix code if $X \subseteq \mathcal{F}$ and given
any two distinct elements of $X$, neither one is a prefix of the other. An $\mathcal{F}$-prefix code is $\mathcal{F}$-maximal
if it is not properly contained in any other $\mathcal{F}$-prefix code. The simplest example of an $\mathcal{F}$-maximal
prefix code is the set of all elements of $\mathcal{F}$ of some fixed length $d$. Each $\mathcal{F}$-maximal prefix code $X$
defines a partition

$\mathbb{N} = \bigcup_{u \in X} \omega\big|_u$.

CENTRAL SETS GENERATED BY UNIFORMLY RECURRENT WORDS

If $\omega$ is a Sturmian word, then the corresponding partition is called a Sturmian partition.

3. Ultrafilters, IP-sets and central sets 

3.1. Stone-Čech compactification. Many of our results rely on the algebraic/topological properties
of the Stone-Čech compactification of $\mathbb{N}$, denoted $\beta\mathbb{N}$. We regard $\beta\mathbb{N}$ as the set of all ultrafilters
on $\mathbb{N}$ with the Stone topology.

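Recall that a set $\mathcal{U}$ of subsets of $\mathbb{N}$ is called an ultrafilter if the following conditions hold:

• $\emptyset \notin \mathcal{U}$.

• If $A \in \mathcal{U}$ and $A \subseteq B$, then $B \in \mathcal{U}$.

• $A \cap B \in \mathcal{U}$ whenever both $A$ and $B$ belong to $\mathcal{U}$.

• For every $A \subseteq \mathbb{N}$ either $A \in \mathcal{U}$ or $A^c \in \mathcal{U}$ where $A^c$ denotes the complement of $A$.

For every natural number $n \in \mathbb{N}$, the set $\mathcal{U}_n = \{A \subseteq \mathbb{N} \mid n \in A\}$ is an example of an ultrafilter.
This defines an injection $i : \mathbb{N} \hookrightarrow \beta\mathbb{N}$ by: $n \mapsto \mathcal{U}_n$. An ultrafilter of this form is said to be principal.
By way of Zorn's lemma, one can show the existence of non-principal (or free) ultrafilters.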

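It is customary to denote elements of $\beta\mathbb{N}$ by letters $p, q, r, \ldots$ For each set $A \subseteq \mathbb{N}$, we set
$A^\circ = \{p \in \beta\mathbb{N} \mid A \in p\}$. Then the set $\mathcal{B} = \{A^\circ \mid A \subseteq \mathbb{N}\}$ forms a basis for the open sets (as well as
a basis for the closed sets) of $\beta\mathbb{N}$ and defines a topology on $\beta\mathbb{N}$ with respect to which $\beta\mathbb{N}$ is both
compact and Hausdorff.¹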

There is a natural extension of the operation of addition $+$ on $\mathbb{N}$ to $\beta\mathbb{N}$ making $\beta\mathbb{N}$ a compact left-topological
semigroup. More precisely we define addition of two ultrafilters $p, q$ by the following
rule:²

$p + q = \{A \subseteq \mathbb{N} \mid \{n \in \mathbb{N} \mid A - n \in p\} \in q\}$.

It is readily verified that $p + q$ is once again an ultrafilter and that for each fixed $p \in \beta\mathbb{N}$, the
mapping $q \mapsto p + q$ defines a continuous map from $\beta\mathbb{N}$ into itself. The operation of addition in $\beta\mathbb{N}$
is associative and for principal ultrafilters we have $\mathcal{U}_m + \mathcal{U}_n = \mathcal{U}_{m+n}$. However in general addition
of ultrafilters is highly non-commutative. In fact it can be shown that the center is precisely the set
of all principal ultrafilters [24].
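
The identity for principal ultrafilters can be verified directly from the addition rule. The following worked computation is added for illustration; it assumes only the rule for $p + q$ stated above and the convention that $A - n$ denotes the set of all $k \in \mathbb{N}$ with $k + n \in A$.

```latex
\begin{align*}
A \in \mathcal{U}_m + \mathcal{U}_n
  &\iff \{k \in \mathbb{N} \mid A - k \in \mathcal{U}_m\} \in \mathcal{U}_n \\
  &\iff A - n \in \mathcal{U}_m \\
  &\iff m \in A - n \\
  &\iff m + n \in A \\
  &\iff A \in \mathcal{U}_{m+n}.
\end{align*}
```

Since this holds for every $A \subseteq \mathbb{N}$, the two ultrafilters have the same members and hence coincide.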

3.2. IP-sets and central sets. Let $(S, +)$ be a semigroup. An element $p \in S$ is called an idempotent
if $p + p = p$. We recall the following result of Ellis [18]:

Theorem 3.1 (Ellis [18]). Let $(S, +)$ be a compact left-topological semigroup (i.e., $\forall x \in S$ the
mapping $y \mapsto x + y$ is continuous). Then $S$ contains an idempotent.

It follows that $\beta\mathbb{N}$ contains a non-principal ultrafilter $p$ satisfying $p + p = p$. In fact, we could
simply apply Ellis's result to the semigroup $\beta\mathbb{N} - \mathcal{U}_0$. This would then exclude the only principal
idempotent ultrafilter, namely $\mathcal{U}_0$. From here on, by an idempotent ultrafilter in $\beta\mathbb{N}$ we mean a free
idempotent ultrafilter.

1 Although the existence of free ultrafilters requires Zorn's lemma, the cardinality of $\beta\mathbb{N}$ is $2^{2^{\mathbb{N}}}$ from which it follows
that $\beta\mathbb{N}$ is not metrizable.

2 Our definition of addition of ultrafilters is the same as that given in 0] but is the reverse of that given in [24] in
which $A \in p + q$ if and only if $\{n \in \mathbb{N} \mid A - n \in q\} \in p$. In this case, $\beta\mathbb{N}$ becomes a compact right-topological
semigroup.

We will make use of the following striking result due to Hindman linking IP-sets and idempotents
in $\beta\mathbb{N}$:

Theorem 3.2 (Theorem 5.12 in [24]). A subset $A \subseteq \mathbb{N}$ is an IP-set if and only if $A \in p$ for some
idempotent $p \in \beta\mathbb{N}$.

It follows immediately that $A$ is an IP*-set if and only if $A \in p$ for every idempotent $p \in \beta\mathbb{N}$ (see
Theorem 2.15 in [4]). We also note that the property of being an IP-set is partition regular.

In [20], Furstenberg introduced a special class of IP-sets, called central sets, having additional
rich combinatorial properties. They were originally defined in terms of topological dynamics (see
Definition 1.2). As in the case of IP-sets, they may be alternatively defined in terms of belonging
to a special class of free ultrafilters, called minimal idempotents. To define a minimal idempotent
we must first review some basic properties concerning ideals in $\beta\mathbb{N}$.

Let $(S, +)$ be any semigroup. Recall that a subset $\mathcal{I} \subseteq S$ is called a right (resp. left) ideal if
$\mathcal{I} + S \subseteq \mathcal{I}$ (resp. $S + \mathcal{I} \subseteq \mathcal{I}$). It is called a two sided ideal if it is both a left and right ideal. A
right (resp. left) ideal $\mathcal{I}$ is called minimal if every right (resp. left) ideal $\mathcal{J}$ included in $\mathcal{I}$ coincides
with $\mathcal{I}$.

Minimal right/left ideals do not necessarily exist, e.g., the commutative semigroup $(\mathbb{N}, +)$ has no
minimal right/left ideals (the ideals in $\mathbb{N}$ are all of the form $\mathcal{I}_n = [n, +\infty) = \{m \in \mathbb{N} \mid m \geq n\}$).
However, every compact Hausdorff left-topological semigroup $S$ (e.g., $\beta\mathbb{N}$) admits a smallest two
sided ideal $K(S)$ which is at the same time the union of all minimal right ideals of $S$ and the union
of all minimal left ideals of $S$ (see for instance [24]). It is readily verified that the intersection of
any minimal left ideal with any minimal right ideal is a group. In particular, there are idempotents
in $K(S)$. Such idempotents are called minimal and their elements are called central sets:

Definition 3.3. An idempotent $p$ is called a minimal idempotent of $S$ if it belongs to $K(S)$.

Definition 3.4. A subset $A \subseteq \mathbb{N}$ is called central if it is a member of some minimal idempotent in
$\beta\mathbb{N}$. It is called a central*-set if it belongs to every minimal idempotent in $\beta\mathbb{N}$.

The equivalence between Definitions 1.2 and 3.4 is due to Bergelson and Hindman in [5]. It
follows from the above definition that every central set is an IP-set and that the property of being
central is partition regular. Central sets are known to have substantial combinatorial structure.
For example, any central set contains arbitrarily long arithmetic progressions, and solutions to
all partition regular systems of homogeneous linear equations (see for example [0). Many of
the rich properties of central sets are a consequence of the Central Sets Theorem first proved by
Furstenberg in Proposition 8.21 in [20] (see also [QTl, [6], [25]). Furstenberg pointed out that as
an immediate consequence of the Central Sets Theorem one has that whenever $\mathbb{N}$ is divided into
finitely many classes, and a sequence $(x_n)_{n \in \mathbb{N}}$ is given, one of the classes must contain arbitrarily
long arithmetic progressions whose increment belongs to $\{\sum_{n \in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\}$.

3.3. Limits of ultrafilters. It is often convenient to think of an ultrafilter $p$ as a $\{0, 1\}$-valued,
finitely additive probability measure on the power set of $\mathbb{N}$. More precisely, for any subset $A \subseteq \mathbb{N}$,
we say $A$ has $p$-measure 1, or is $p$-large if $A \in p$. This notion of measure gives rise to a notion of
convergence of sequences indexed by $\mathbb{N}$ which is the key tool in allowing us to apply ideas from
combinatorics on words to the framework of ultrafilters. However, from our point of view, it is
more natural to define it alternatively as a mapping from words to words (see Remark 3.10). Let
$\mathcal{A}$ denote a non-empty finite set. Then each ultrafilter $p \in \beta\mathbb{N}$ naturally defines a mapping

$p^* : \mathcal{A}^{\mathbb{N}} \to \mathcal{A}^{\mathbb{N}}$

as follows:

Definition 3.5. For each $p \in \beta\mathbb{N}$ and $\omega \in \mathcal{A}^{\mathbb{N}}$, we define $p^*(\omega) \in \mathcal{A}^{\mathbb{N}}$ by the condition: $u \in \mathcal{A}^*$ is
a prefix of $p^*(\omega)$ $\iff$ $\omega\big|_u \in p$.

We note that if $u, v \in \mathcal{A}^*$, $\omega\big|_u, \omega\big|_v \in p$ and $|v| \geq |u|$, then $u$ is a prefix of $v$. In fact, if $v'$ denotes
the prefix of $v$ of length $|u|$ then as $\omega\big|_v \subseteq \omega\big|_{v'}$, it follows that $\omega\big|_{v'} \in p$ and hence $u = v'$,
since the sets $\omega\big|_u$ and $\omega\big|_{v'}$ are disjoint whenever $u \neq v'$ have the same length. Thus
$p^*(\omega)$ is well defined.
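
As a simple illustration of Definition 3.5, consider a principal ultrafilter. The computation below is an added worked example; it assumes the notation $\mathcal{U}_n = \{A \subseteq \mathbb{N} \mid n \in A\}$ for the principal ultrafilter at $n$ and $T$ for the shift map of §2.1.

```latex
\begin{align*}
u \text{ is a prefix of } \mathcal{U}_n^*(\omega)
  &\iff \omega\big|_u \in \mathcal{U}_n \\
  &\iff n \in \omega\big|_u \\
  &\iff \omega_n \omega_{n+1} \cdots \omega_{n+|u|-1} = u \\
  &\iff u \text{ is a prefix of } T^n(\omega).
\end{align*}
```

Hence $\mathcal{U}_n^*(\omega) = T^n(\omega)$: principal ultrafilters act as shifts, and free ultrafilters may be viewed as limits of shifts.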

We note that if $\omega, \nu \in \mathcal{A}^{\mathbb{N}}$ and if each prefix $u$ of $\nu$ is a factor of $\omega$, then there exists an ultrafilter
$p \in \beta\mathbb{N}$ such that $p^*(\omega) = \nu$. In fact, the set

$\mathcal{C} = \{\omega\big|_u \mid u \text{ is a prefix of } \nu\}$

satisfies the finite intersection property, and hence by a routine argument involving Zorn's lemma
it follows that there exists a $p \in \beta\mathbb{N}$ with $\mathcal{C} \subseteq p$.

It follows immediately from the definition of $p^*$, Definition 3.4 and Theorem 3.2 that

Lemma 3.6. The set $\omega\big|_u$ is an IP-set (resp. central set) if and only if $u$ is a prefix of $p^*(\omega)$ for some
idempotent (resp. minimal idempotent) $p \in \beta\mathbb{N}$.

Lemma 3.7. For each $p \in \beta\mathbb{N}$, $\omega \in \mathcal{A}^{\mathbb{N}}$ and $u \in \mathcal{A}^*$ we have

$p^*(\omega)\big|_u = \{m \in \mathbb{N} \mid \omega\big|_u - m \in p\}$

where $\omega\big|_u - m$ is defined as the set of all $n \in \mathbb{N}$ such that $n + m \in \omega\big|_u$.

Proof. Suppose $m \in p^*(\omega)\big|_u$. Then by definition $u$ occurs in position $m$ in $p^*(\omega)$. Let $v$ denote the
prefix of $p^*(\omega)$ of length $|v| = m + |u|$. Then, as $u$ is a suffix of $v$ we have $\omega\big|_v + m \subseteq \omega\big|_u$ and
hence $\omega\big|_v \subseteq \omega\big|_u - m$. But as $v$ is a prefix of $p^*(\omega)$ we have $\omega\big|_v \in p$ and hence $\omega\big|_u - m \in p$ as
required.

Conversely, fix $m \in \mathbb{N}$ such that $\omega\big|_u - m \in p$. Let $Z$ be the set of all factors $v$ of $\omega$ of length
$|v| = m + |u|$ ending in $u$. Then

$\omega\big|_u - m \subseteq \bigcup_{v \in Z} \omega\big|_v$.

It follows that there exists $v \in Z$ such that $\omega\big|_v \in p$. In other words, there exists $v \in Z$ such that $v$
is a prefix of $p^*(\omega)$. It follows that $u$ occurs in position $m$ in $p^*(\omega)$. □

Lemma 3.8. For $p, q \in \beta\mathbb{N}$ and $\omega \in \mathcal{A}^{\mathbb{N}}$, we have $(p + q)^*(\omega) = q^*(p^*(\omega))$. In particular, if $p$ is
an idempotent, then $p^*(p^*(\omega)) = p^*(\omega)$.

Proof. For each word $u \in \mathcal{A}^*$ we have that $u$ is a prefix of $(p + q)^*(\omega)$ if and only if

$\omega\big|_u \in p + q \iff \{m \in \mathbb{N} \mid \omega\big|_u - m \in p\} \in q$.

On the other hand, $u$ is a prefix of $q^*(p^*(\omega))$ if and only if $p^*(\omega)\big|_u \in q$. The result now follows
immediately from the preceding lemma. □
Lemma 3.9. For each $p \in \beta\mathbb{N}$ and $\omega \in \mathcal{A}^{\mathbb{N}}$ we have $p^*(T(\omega)) = T(p^*(\omega))$ where $T : \mathcal{A}^{\mathbb{N}} \to \mathcal{A}^{\mathbb{N}}$
denotes the shift map.

Proof. Assume $u \in \mathcal{A}^*$ is a prefix of $p^*(T(\omega))$. Then $T(\omega)\big|_u \in p$. But

$T(\omega)\big|_u = \bigcup_{a \in \mathcal{A}} \omega\big|_{au}$.

It follows that there exists $a \in \mathcal{A}$ such that $\omega\big|_{au} \in p$. Thus $au$ is a prefix of $p^*(\omega)$ and hence $u$ is a
prefix of $T(p^*(\omega))$. □

Remark 3.10. It is readily verified that our definition of $p^*$ coincides with that of $p\text{-}\lim_n$. More
precisely, given a sequence $(x_n)_{n \in \mathbb{N}}$ in a topological space and an ultrafilter $p \in \beta\mathbb{N}$, we write
$p\text{-}\lim_n x_n = y$ if for every neighborhood $U_y$ of $y$ one has $\{n \mid x_n \in U_y\} \in p$. In our case we have
$p^*(\omega) = p\text{-}\lim_n (T^n(\omega))$ (see [22]). With this in mind, the preceding two lemmas are well known
(see for instance [8, 22]). However, our defining condition of $p^*$ in Definition 3.5 does not directly
rely on the topology and so may be applied in other general settings. For instance, let $\Omega \subseteq \mathcal{A}^{\mathbb{N}}$ be
a subshift, and $\mathcal{N} = \{n_0 < n_1 < n_2 < \cdots\}$ an infinite sequence of natural numbers. For each
$\omega \in \Omega$ we put

$X^{\mathcal{N}}_k = \{\omega_{n+n_0} \omega_{n+n_1} \cdots \omega_{n+n_{k-1}} \mid n \geq 0\} \subseteq \mathcal{A}^k$.

For each $u \in X^{\mathcal{N}}_k$ we define the set

$\omega^{\mathcal{N}}\big|_u = \{n \in \mathbb{N} \mid \omega_{n+n_0} \omega_{n+n_1} \cdots \omega_{n+n_{k-1}} = u\}$.

Then the sets $\omega^{\mathcal{N}}\big|_u$ with $u \in X^{\mathcal{N}}_k$ partition $\mathbb{N}$. So, given $p \in \beta\mathbb{N}$, for each $k \geq 1$ there exists a
unique $u \in X^{\mathcal{N}}_k$ with $\omega^{\mathcal{N}}\big|_u \in p$. Moreover if $v \in X^{\mathcal{N}}_{k+1}$ and $\omega^{\mathcal{N}}\big|_v \in p$, then $u$ is a prefix of $v$.

So using the condition in Definition 3.5 each infinite sequence $\mathcal{N}$ and ultrafilter $p \in \beta\mathbb{N}$ defines
a mapping $\Omega \to \mathcal{A}^{\mathbb{N}}$. Of particular interest is the case in which $\Omega$ is a uniform set in the sense of T.
Kamae and $\mathcal{N}$ is chosen such that $\Omega[\mathcal{N}]$ is a super-stationary set (see [26, 27]).

Another situation in which the defining condition of Definition 3.5 applies is in the context of
infinite permutations [19]. By an infinite permutation $\pi$ we mean a linear ordering on $\mathbb{N}$. Then for
each finite permutation $u$ of $\{1, 2, \ldots, n\}$ we say that $u$ occurs in position $m$ of $\pi$ if the restriction
of $\pi$ to $\{m, m + 1, \ldots, m + n - 1\}$ is equal to $u$. Thus we may define the set $\pi\big|_u$ as the set of
all $m \in \mathbb{N}$ such that $u$ occurs in position $m$ in $\pi$, and again the sets $\pi\big|_u$ (over all permutations $u$
of $\{1, 2, \ldots, n\}$) determine a partition of $\mathbb{N}$. Hence each $p \in \beta\mathbb{N}$ defines a map from the set of all
infinite permutations into itself.

In what follows, we will make use of the following key result in [24] (see also Theorem 1 in (SI
and Theorem 3.4 in [Q):

Theorem 3.11 (Theorem 19.26 in [24]). Let $(X, T)$ be a topological dynamical system. If two
points $x, y \in X$ are proximal with $y$ uniformly recurrent, then there exists a minimal idempotent
$p \in \beta\mathbb{N}$ such that $p^*(x) = y$.

As a consequence we have 

Theorem 3.12. Let $\omega \in \mathcal{A}^{\mathbb{N}}$ be a uniformly recurrent word, and let $u \in \mathcal{A}^+$. Then $\omega\big|_u$ is an IP-set
if and only if $\omega\big|_u$ is a central set.

Proof. For any $A \subseteq \mathbb{N}$ we have that if $A$ is central then $A$ belongs to some minimal idempotent
$p \in \beta\mathbb{N}$ and hence in particular $A$ belongs to an idempotent in $\beta\mathbb{N}$. Hence by Theorem 3.2 we
have that $A$ is an IP-set. Now suppose that $\omega\big|_u$ is an IP-set. Then $\omega\big|_u$ belongs to some idempotent
$p \in \beta\mathbb{N}$. Set $\nu = p^*(\omega)$. Then $u$ is a prefix of $\nu$. Also, since $p$ is idempotent we have $p^*(\nu) =
p^*(p^*(\omega)) = p^*(\omega) = \nu$. Hence for every prefix $v$ of $\nu$ we have that $\nu\big|_v \in p$ and $\omega\big|_v \in p$ and hence
$\nu\big|_v \cap \omega\big|_v \in p$. In particular $\nu\big|_v \cap \omega\big|_v \neq \emptyset$. Hence $\omega$ and $\nu$ are proximal. Since $\omega$ is uniformly
recurrent, it follows that $\nu$ is also uniformly recurrent. Hence by Theorem 3.11 there exists a
minimal idempotent $q$ with $q^*(\omega) = \nu$. Hence $\omega\big|_u \in q$, whence $\omega\big|_u$ is central. □

Remark 3.13. A special case of Theorem 3.11 states that if $x$ and $y$ are uniformly recurrent infinite
words, then $x$ and $y$ are proximal if and only if $p^*(x) = y$ for some idempotent ultrafilter $p \in \beta\mathbb{N}$.
In the case of binary words, we could consider the following alternative notion: we say that $x$ and
$y$ are anti-proximal if the set $\{n \in \mathbb{N} \mid x_n \neq y_n\}$ is thick. For example the two fixed points $\mathbf{t}_0$ and
$\mathbf{t}_1$ of the Thue-Morse morphism are anti-proximal. In [9], together with N. Hindman we show that
for every prefix $u$ of $\mathbf{t}_1$, the set $\mathbf{t}_0|_u$ is finite FS-big. We recall that $A \subseteq \mathbb{N}$ is finite FS-big if for every $k$
there exists $(x_j)_{j=1}^{k}$ such that $FS(x_j)_{j=1}^{k} \subseteq A$, where

$$FS(x_j)_{j=1}^{k} = \Big\{ \sum_{j \in F} x_j \;\Big|\; F \subseteq \{1, 2, \ldots, k\} \Big\}.$$

As in the case of IP-sets, the property of being finite FS-big is partition regular, i.e., if $A \subseteq \mathbb{N}$ is
finite FS-big and $A = \bigcup_{i=1}^{r} A_i$, then some $A_i$ is finite FS-big (see flU). In the context of binary
words, the notions of proximality and anti-proximality are somewhat similar in the sense that in
both cases the behavior of one word is strongly affected by the behavior of the other: if $x$
and $y$ are proximal, then $x$ does as $y$ on a thick set, while if $x$ and $y$ are anti-proximal, then $x$ and $y$
play opposites on a thick set. One might ask for an analogue of Theorem 3.11
characterizing anti-proximality.

4. A FIRST ANALYSIS OF SOME CONCRETE EXAMPLES 

4.1. The Fibonacci word. While most of the proofs of the results announced in the Introduction 
rely on the algebraic and topological properties of ultrafilters on N and their links to IP-sets, we 
begin by analyzing concretely a few examples generated by simple substitution rules. To establish 
that certain subsets of N are IP-sets, we will use nothing more than the definition of IP-sets and 
the abstract numeration systems defined by substitutions first introduced by J.-M. Dumont and A.
Thomas [15, 16].

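Let us begin with the Fibonacci infinite word $\mathbf{f} = f_0 f_1 f_2 \ldots \in \{0, 1\}^{\mathbb{N}}$ given by

$$\mathbf{f} = 01001010010010100101001001010010010100101001001010010 \cdots$$

We set

$$\mathbf{f}|_0 = \{n \in \mathbb{N} \mid f_n = 0\}$$

and

$$\mathbf{f}|_1 = \{n \in \mathbb{N} \mid f_n = 1\}.$$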

So $\mathbf{f}|_0 = \{0,2,3,5,7,8,10,11,13,15,16,\ldots\}$ and $\mathbf{f}|_1 = \{1,4,6,9,12,14,17,\ldots\}$. This defines
the Sturmian partition $\mathbb{N} = \mathbf{f}|_0 \cup \mathbf{f}|_1$. Let us denote by $F_n$ the $n$th Fibonacci number so that
$F_0 = 1$, $F_1 = 2$, $F_2 = 3, \ldots$ It is well known that each positive integer $n$ has one or more representations
when expressed as a sum of distinct Fibonacci numbers, i.e., $n = \sum_{i=0}^{k} t_i F_i$ with $t_i \in \{0, 1\}$ and
$t_k = 1$. We call the associated $\{0, 1\}$-word $t_k t_{k-1} \cdots t_0$ a representation of $n$. For example, for
$n = 50$ we obtain the following 6 representations (arranged in decreasing lexicographic order):

10100100 
10100011 
10011100 
10011011 

1111100 

1111011 
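The six representations listed above can be recovered by brute force. The following is a minimal sketch, assuming the indexing $F_0 = 1$, $F_1 = 2$ used in the text; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations

def fib_numbers(limit):
    """Fibonacci numbers F_0 = 1, F_1 = 2, F_2 = 3, ... that do not exceed limit."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= limit:
        fibs.append(fibs[-1] + fibs[-2])
    return [f for f in fibs if f <= limit]

def representations(n):
    """All words t_k ... t_0 with t_k = 1 and n equal to the sum of t_i * F_i."""
    fibs = fib_numbers(n)
    words = []
    for size in range(1, len(fibs) + 1):
        for idx in combinations(range(len(fibs)), size):
            if sum(fibs[i] for i in idx) == n:
                top = max(idx)
                words.append("".join("1" if i in idx else "0"
                                     for i in range(top, -1, -1)))
    # Longer words first, then decreasing lexicographic order within a length.
    return sorted(words, key=lambda w: (len(w), w), reverse=True)

if __name__ == "__main__":
    for word in representations(50):
        print(word)
```

The exhaustive search is exponential in the number of Fibonacci numbers below $n$, which is harmless for $n = 50$ but is not meant as an efficient enumeration.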

The lexicographically largest representation is obtained by applying the greedy algorithm. This
gives rise to a representation of $n$ of the form $n = \sum_{i=0}^{k} t_i F_i$ with $t_{i+1} t_i \neq 11$ for each $0 \le i \le
k - 1$. This representation of $n$ is called the Zeckendorff representation [30] (a special case of the
Dumont-Thomas numeration system [15, 16]). We shall write $Z(n) = t_k t_{k-1} \ldots t_0$. It follows
immediately that $Z(F_n) = 10^n$. The connection between $Z(n)$ and the entry $f_n$ of the Fibonacci
word $\mathbf{f}$ is given by the following well known fact: $f_n = 0$ whenever $Z(n)$ ends in 0 and $f_n = 1$
whenever $Z(n)$ ends in 1. Thus

$$\mathbf{f}|_0 = \{n \in \mathbb{N} \mid Z(n) \text{ ends in } 0\}$$

and

$$\mathbf{f}|_1 = \{n \in \mathbb{N} \mid Z(n) \text{ ends in } 1\}.$$
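The link between the last digit of $Z(n)$ and the letter $f_n$ can be checked on a finite prefix. The sketch below assumes the greedy algorithm described above and generates $\mathbf{f}$ by iterating the substitution $0 \mapsto 01$, $1 \mapsto 0$; treating the empty representation of $0$ as ending in 0 is a convention adopted only for this illustration.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]

def zeckendorff(n):
    """Greedy representation Z(n) as a 0/1 string t_k ... t_0 (empty for n = 0)."""
    fibs = [1, 2]                      # F_0 = 1, F_1 = 2
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    digits = []
    for f in reversed(fibs):
        if f <= n:
            digits.append("1")
            n -= f
        else:
            digits.append("0")
    return "".join(digits).lstrip("0")

def last_digit(n):
    z = zeckendorff(n)
    return z[-1] if z else "0"         # convention for n = 0 (assumption)

f = fibonacci_word(500)
agree = all(f[n] == last_digit(n) for n in range(500))
```

Here `agree` records whether $f_n$ coincides with the last digit of $Z(n)$ for every $n < 500$.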

We now consider the sequence $(x_n)_{n \in \mathbb{N}}$ given by $x_n = F_{2n+1}$. It is readily verified that for each
$A \in \mathrm{Fin}(\mathbb{N})$, the Zeckendorff representation of $\sum_{n \in A} x_n$ ends in $10^{2m+1}$ where $m = \min(A)$.
In fact, the symbolic sum of the individual Zeckendorff representations of each $x_n$ occurring in
$\sum_{n \in A} x_n$ does not involve any carry overs. Moreover the resulting expression does not contain
any occurrences of 11 and hence is equal to the Zeckendorff representation of $\sum_{n \in A} x_n$. Thus
every finite sum of the form $\sum_{n \in A} x_n$ with $A \in \mathrm{Fin}(\mathbb{N})$ belongs to $\mathbf{f}|_0$. Thus we have shown that
$\mathbf{f}|_0$ is an IP-set.
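As a finite sanity check of this argument, one can form all sums of distinct terms among the first few $x_n = F_{2n+1}$ and look them up in a prefix of $\mathbf{f}$. This is only a sketch over five terms (an arbitrary cutoff chosen for illustration), not a proof.

```python
from itertools import combinations

def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]

# Fibonacci numbers with the indexing F_0 = 1, F_1 = 2 used in the text.
F = [1, 2]
while len(F) < 12:
    F.append(F[-1] + F[-2])

xs = [F[2 * n + 1] for n in range(5)]           # x_0, ..., x_4
finite_sums = sorted({sum(c) for r in range(1, len(xs) + 1)
                      for c in combinations(xs, r)})
f = fibonacci_word(finite_sums[-1] + 1)
all_in_f0 = all(f[s] == "0" for s in finite_sums)
```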

We next verify that $\mathbf{f}|_1$ is not an IP-set, and hence $\mathbf{f}|_0$ is an IP*-set. We will use the following
general observation. Consider a subset $A \subseteq \mathbb{N}$ partitioned into $k > 0$ non-intersecting sets:
$A = A_1 \cup A_2 \cup \cdots \cup A_k$. Suppose that for each $1 \le j \le k$ there exists a positive integer
$N$ (which may depend on $j$) such that whenever $m_1, m_2, \ldots, m_N$ are distinct elements of $A_j$,
we have $\sum_{i=1}^{N} m_i \notin A$. Then $A$ is not an IP-set. In fact, if $A$ were an IP-set, then for some
$1 \le j \le k$, there would exist a sequence $x_1 < x_2 < x_3 < \cdots$ contained in $A_j$ such that
$\{\sum_{n \in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\} \subseteq A$; taking $F$ of cardinality $N$ would then contradict the hypothesis on $A_j$.

Let $\alpha = \frac{3 - \sqrt{5}}{2}$. Then the Fibonacci word $\mathbf{f}$ is the coding of the orbit of the point $\alpha$ under the irrational rotation $R_\alpha$
of the unit circle by $\alpha$. Let $I$ be the interval $[1 - \alpha, 1)$ (the interval coded by 1). So $n \in \mathbf{f}|_1$ if and
only if $R_\alpha^n(\alpha) = \{\alpha + n\alpha\} = \{(n + 1)\alpha\} \in I$.

Fix

$$(1 - \alpha)/3 < \alpha' < (1 - \alpha)/2$$

and put

$$I_1 = [1 - \alpha, 1 - \alpha') \quad \text{and} \quad I_2 = [1 - \alpha', 1).$$

Since $\alpha' < (1 - \alpha)/2$ it follows that $\alpha' < \alpha$. Also for $j = 1, 2$ set

$$A_j = \{n \in \mathbb{N} \mid R_\alpha^n(\alpha) \in I_j\}.$$

Thus $A_1, A_2$ partition the set $\mathbf{f}|_1$. We now show that $\mathbf{f}|_1$ is not an IP-set by showing that the sum
of any three elements of $A_1$ belongs to $\mathbf{f}|_0$ and that the sum of any two elements of $A_2$ belongs to
$\mathbf{f}|_0$.

Now take any $n_1, n_2, n_3 \in A_1$ and set

$$x_1 = \{(n_1 + 1)\alpha\}, \quad x_2 = \{(n_2 + 1)\alpha\}, \quad x_3 = \{(n_3 + 1)\alpha\}.$$

Then $x_1, x_2, x_3 \in [1 - \alpha, 1 - \alpha')$ and $n_1 + n_2 + n_3$ corresponds to the point

$$\{(n_1 + n_2 + n_3 + 1)\alpha\} = \{x_1 + x_2 + x_3 - 2\alpha\}.$$

Since $x_1, x_2, x_3 \in [1 - \alpha, 1 - \alpha')$, we have

$$\{x_1 + x_2 + x_3 - 2\alpha\} \in [\{3 - 5\alpha\}, \{3 - 3\alpha' - 2\alpha\}).$$



Since $\alpha' > \frac{1 - \alpha}{3}$ it follows that

$$2 - 3\alpha' - 2\alpha < 1 - \alpha,$$

and hence

$$\{2 - 3\alpha' - 2\alpha\} < 1 - \alpha,$$

which gives

$$\{3 - 3\alpha' - 2\alpha\} < 1 - \alpha$$

as required.
Similarly take any $n_1, n_2 \in A_2$. Set

$$x_1 = \{(n_1 + 1)\alpha\}, \quad x_2 = \{(n_2 + 1)\alpha\}$$

so that $x_1, x_2 \in [1 - \alpha', 1)$. Then $n_1 + n_2$ corresponds to the point

$$\{(n_1 + n_2 + 1)\alpha\} = \{x_1 + x_2 - \alpha\}.$$

Since $x_1, x_2 \in [1 - \alpha', 1)$, we have

$$\{x_1 + x_2 - \alpha\} \in [\{2 - 2\alpha' - \alpha\}, 1 - \alpha).$$
Since $\alpha' < \frac{1 - \alpha}{2}$ it follows that

$$1 - 2\alpha' - \alpha > 0,$$

and hence

$$\{2 - 2\alpha' - \alpha\} > 0.$$
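The two claims (three elements of $A_1$, two elements of $A_2$) can be tested numerically. The sketch below assumes the particular value $\alpha' = 0.25$, which lies strictly between $(1-\alpha)/3 \approx 0.206$ and $(1-\alpha)/2 \approx 0.309$, and uses floating point arithmetic, which is adequate for the small range of $n$ considered but is not an exact computation.

```python
import math
from itertools import combinations

ALPHA = (3 - math.sqrt(5)) / 2
ALPHA_PRIME = 0.25       # hypothetical choice inside the admissible range

def point(n):
    """The point {(n + 1) * alpha} of the orbit of alpha under the rotation."""
    return ((n + 1) * ALPHA) % 1.0

def letter(n):
    """The letter f_n: 1 if the orbit point lies in [1 - alpha, 1), else 0."""
    return 1 if point(n) >= 1 - ALPHA else 0

RANGE = 80
A1 = [n for n in range(RANGE) if 1 - ALPHA <= point(n) < 1 - ALPHA_PRIME]
A2 = [n for n in range(RANGE) if point(n) >= 1 - ALPHA_PRIME]

# Sums of three distinct elements of A1 and of two distinct elements of A2.
triples_ok = all(letter(a + b + c) == 0 for a, b, c in combinations(A1, 3))
pairs_ok = all(letter(a + b) == 0 for a, b in combinations(A2, 2))
```

Both flags being true is consistent with the argument above: within the tested range, no such sum lands back in $\mathbf{f}|_1$.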

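The above arguments may be generalized to show that $\mathbf{f}|_u$ is an IP*-set for every prefix $u$ of $\mathbf{f}$.

In contrast, let us consider the sets $\mathbf{g}|_0$ and $\mathbf{g}|_1$ where $\mathbf{g} = 0\mathbf{f} = 001001010010010\ldots$ Thus,
$$\mathbf{g}|_0 = \{n \in \mathbb{N} \mid g_n = 0\} = \{0\} \cup \{n \ge 1 \mid f_{n-1} = 0\}.$$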

Consider the sequence $(y_n)_{n \in \mathbb{N}}$ defined by $y_n = F_{2n+2}$. It is readily verified that $Z(y_n - 1) =
(10)^{n+1}$ and hence each $y_n$ belongs to $\mathbf{g}|_0$. Now fix $A \in \mathrm{Fin}(\mathbb{N})$. Since the Zeckendorff representation
of $\sum_{n \in A} y_n$ ends in $10^{2m+2}$ where $m = \min(A)$, it follows that $Z(\sum_{n \in A} y_n - 1)$ ends in
$(10)^{m+1}$, and hence $\sum_{n \in A} y_n \in \mathbf{g}|_0$. Thus, $\mathbf{g}|_0$ is an IP-set. Similarly, it is readily verified that
for each $A \in \mathrm{Fin}(\mathbb{N})$, we have that $\sum_{n \in A} x_n \in \mathbf{g}|_1$ where $x_n = F_{2n+1}$. Thus this time we obtain
the Sturmian decomposition $\mathbb{N} = \mathbf{g}|_0 \cup \mathbf{g}|_1$ in which both sets $\mathbf{g}|_0$ and $\mathbf{g}|_1$ are IP-sets, and hence
central sets. In this case, neither $\mathbf{g}|_0$ nor $\mathbf{g}|_1$ is an IP*-set. Once again, these arguments may be
extended to show that both $\mathbf{g}|_{0u}$ and $\mathbf{g}|_{1u}$ are central sets for any prefix $u$ of $\mathbf{f}$ and hence neither set
is an IP*-set.
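The contrast with $\mathbf{f}$ can be observed on a finite prefix of $\mathbf{g} = 0\mathbf{f}$: finite sums of the $y_n = F_{2n+2}$ fall on letters 0 and finite sums of the $x_n = F_{2n+1}$ fall on letters 1. The following sketch checks this for the first five terms of each sequence; the cutoff and the helper names are assumptions made for the illustration.

```python
from itertools import combinations

def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]

def finite_sums(terms):
    """All sums of non-empty sets of distinct entries of terms."""
    return {sum(c) for r in range(1, len(terms) + 1)
            for c in combinations(terms, r)}

F = [1, 2]                                   # F_0 = 1, F_1 = 2
while len(F) < 12:
    F.append(F[-1] + F[-2])

ys = [F[2 * n + 2] for n in range(5)]        # 3, 8, 21, 55, 144
xs = [F[2 * n + 1] for n in range(5)]        # 2, 5, 13, 34, 89
g = "0" + fibonacci_word(300)                # g = 0f

ys_in_g0 = all(g[s] == "0" for s in finite_sums(ys))
xs_in_g1 = all(g[s] == "1" for s in finite_sums(xs))
```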

In summary, by Theorem 3.12 we have:

Proposition 4.1. Let $\mathbf{f}$ denote the Fibonacci word. Then for every prefix $u$ of $\mathbf{f}$ the set $\mathbf{f}|_u$ is an
IP*-set (and hence a central* set). Setting $\mathbf{g} = 0\mathbf{f}$ we have that for every prefix $u$ of $\mathbf{f}$ the sets $\mathbf{g}|_{0u}$
and $\mathbf{g}|_{1u}$ are both IP-sets (resp. central sets).

4.2. The m-bonacci word. The above analysis extends more generally to the so-called $m$-bonacci
word. Fix a positive integer $m \ge 2$, and let $\mathbf{t} = t_0 t_1 t_2 \ldots \in \{0, 1, \ldots, m - 1\}^{\mathbb{N}}$ denote the $m$-bonacci
infinite word fixed by the substitution

$$\sigma_m : \{0, 1, \ldots, m - 1\} \to \{0, 1, \ldots, m - 1\}^*$$

given by

$$\sigma_m(i) = \begin{cases} 0(i + 1) & \text{for } 0 \le i < m - 1 \\ 0 & \text{for } i = m - 1. \end{cases}$$

Using the associated Dumont-Thomas numeration system, we will show:

Proposition 4.2. Let $m \ge 2$, and consider the partition of $\mathbb{N}$ given by

$$\mathbb{N} = \bigcup_{0 \le k \le m-1} \mathbf{g}|_k$$

where $\mathbf{g} = 0\mathbf{t} \in \{0, 1, \ldots, m - 1\}^{\mathbb{N}}$. Then for each $0 \le k \le m - 1$ the set $\mathbf{g}|_k$ is an IP-set (resp.
central set).

The proof is a simple extension of the ideas outlined above in the case of the Fibonacci word.
For each $m \ge 2$, we define the $m$-bonacci numbers by $T_k = 2^k$ for $0 \le k \le m - 1$ and $T_k =
T_{k-1} + T_{k-2} + \cdots + T_{k-m}$ for $k \ge m$. When $m = 2$, these are the usual Fibonacci numbers.
Each positive integer $n$ may be written in one or more ways in the form $n = \sum_{i=1}^{k} t_i T_{k-i}$ where
$t_i \in \{0, 1\}$ and $t_1 = 1$. By applying the greedy algorithm, one obtains a representation of $n$
of the form $w = t_1 t_2 \cdots t_k$ with the property that $w$ does not contain $m$ consecutive 1's. Such
a representation of $n$ is necessarily unique and is called the $m$-Zeckendorff representation of $n$,
denoted $Z_m(n)$ (see El). Thus $Z_m(T_n) = 10^n$ for $n \ge 0$.

Proof. Fix $0 \le k \le m - 1$. We will show that the set $\mathbf{g}|_k$ is an IP-set. It is well known that $t_n = k$
if and only if $Z_m(n)$ ends in $01^k$. Hence

$$\mathbf{g}|_k = \{n \in \mathbb{N} \mid g_n = k\} = \{n \in \mathbb{N} \mid t_{n-1} = k\} = \{n \in \mathbb{N} \mid Z_m(n - 1) \text{ ends in } 01^k\}.$$

Consider the sequence $(x_n)_{n \in \mathbb{N}}$ given by $x_n = T_{mn+k}$. It is readily verified that for any finite subset
$A \subseteq \mathbb{N}$, the $m$-Zeckendorff representation of the finite sum $s = \sum_{n \in A} x_n$ ends in $10^{mr+k}$ where
$r = \min(A)$, and hence the $m$-Zeckendorff representation of $s - 1$ ends in $(1^{m-1}0)^r 1^k$, and hence
$s \in \mathbf{g}|_k$, as required.

Having established that each of the sets $\mathbf{g}|_k$ is a central set (for $0 \le k \le m - 1$), it follows that no
$\mathbf{g}|_k$ is an IP*-set.

□
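The construction in the proof can be exercised on finite data. The sketch below builds a prefix of $\mathbf{t}$ from the substitution $\sigma_m$, forms $\mathbf{g} = 0\mathbf{t}$, and checks that sums of distinct terms $x_n = T_{mn+k}$ land on the letter $k$. The number of terms (four) is an arbitrary cutoff, and the helper names are not taken from the paper.

```python
from itertools import combinations

def mbonacci_word(m, length):
    """Prefix of the fixed point of sigma_m: i -> 0(i+1) for i < m-1, m-1 -> 0."""
    w = [0]
    while len(w) < length:
        nxt = []
        for c in w:
            nxt.extend([0, c + 1] if c < m - 1 else [0])
        w = nxt
    return w[:length]

def mbonacci_numbers(m, count):
    """T_k = 2^k for k < m, then T_k = T_{k-1} + ... + T_{k-m}."""
    T = [2 ** k for k in range(m)]
    while len(T) < count:
        T.append(sum(T[-m:]))
    return T[:count]

def check(m, terms=4):
    """True if every finite sum of x_n = T_{mn+k}, n < terms, lies in g|_k."""
    T = mbonacci_numbers(m, m * terms)
    ok = True
    for k in range(m):
        xs = [T[m * n + k] for n in range(terms)]
        sums = {sum(c) for r in range(1, terms + 1)
                for c in combinations(xs, r)}
        g = [0] + mbonacci_word(m, max(sums))     # g = 0t
        ok = ok and all(g[s] == k for s in sums)
    return ok
```

For instance, `check(2)` re-tests the Fibonacci case and `check(3)` the Tribonacci case.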



CENTRAL SETS GENERATED BY UNIFORMLY RECURRENT WORDS 






5. STURMIAN PARTITIONS & CENTRAL SETS 

We now study more generally partitions of $\mathbb{N}$ generated by Sturmian words and prove Theorems 2
and 3. Throughout this section $\omega = \omega_0 \omega_1 \omega_2 \ldots \in \{0, 1\}^{\mathbb{N}}$ will denote a Sturmian word, $\mathcal{F}$ the set
of all factors of $\omega$, and $(\Omega, T)$ the subshift generated by $\omega$, where $T$ denotes the shift map. We
denote by $\tilde{\omega} \in \Omega$ the characteristic word.

Lemma 5.1. If $\omega, \omega', \omega'' \in \Omega$ are such that $T^{n_0}(\omega) = T^{n_0}(\omega') = T^{n_0}(\omega'')$, then $\mathrm{Card}\{\omega, \omega', \omega''\} \le
2$.

Proof. This follows immediately from the fact that $\Omega$ contains a unique characteristic word and
that this word is aperiodic. □

We will make use of the following key lemma which essentially says that two distinct Sturmian
words $\omega$ and $\omega'$ are proximal if and only if $T^n(\omega) = T^n(\omega') = \tilde{\omega}$ for some $n \ge 1$.

Lemma 5.2. Let $\omega$ and $\omega'$ be distinct elements of $\Omega$. Then either $T^n(\omega) = T^n(\omega') = \tilde{\omega}$ for some
$n \ge 1$, or there exists $N > 0$ such that $\omega_n \omega_{n+1} \ldots \omega_{n+N} \neq \omega'_n \omega'_{n+1} \ldots \omega'_{n+N}$ for every $n \in \mathbb{N}$.

Proof. We will use the definition of Sturmian words via rotations, which we recalled in Section 2.
Notice that $\tilde{\omega} = s_{\alpha,\alpha} = s'_{\alpha,\alpha}$, and singular words correspond to the case when the orbit of a point
under the rotation map goes through the point $\alpha$. If $s_{\alpha,\rho}$ is non-singular, then $s_{\alpha,\rho} = s'_{\alpha,\rho}$. If $\omega \neq \omega'$ are
singular words defined by rotations of the same point, i.e., $\omega = s_{\alpha,\rho}$, $\omega' = s'_{\alpha,\rho}$, then they differ
only when they pass through $1 - \alpha$ and $0$, i.e., in at most two positions, so there exists $n_0 \ge 1$
such that $T^{n_0}(\omega) = T^{n_0}(\omega') = \tilde{\omega}$.

Now consider the case when $\omega, \omega'$ are defined by rotations of two different points $\rho, \rho'$ with $0 \le \rho <
\rho' < 1$. To be definite, let us consider the interval exchange of $I_0$ and $I_1$ for both $\omega$ and $\omega'$. We
must prove that there exists $N > 0$ such that

$$\omega_n \omega_{n+1} \ldots \omega_{n+N} \neq \omega'_n \omega'_{n+1} \ldots \omega'_{n+N}$$

for every $n \in \mathbb{N}$. We have $\omega_i \neq \omega'_i$ if and only if $\omega_i \in I_0, \omega'_i \in I_1$ or $\omega_i \in I_1, \omega'_i \in I_0$ (here, by a slight
abuse of notation, $\omega_i$ and $\omega'_i$ also denote the $i$th points of the two orbits). This
condition is equivalent to

$$\omega_i \in [1 - \alpha - (\rho' - \rho), 1 - \alpha) \cup [1 - (\rho' - \rho), 1).$$

The orbit of any point under rotation by $\alpha$ is dense: for every $\varepsilon > 0$ there exists
$N(\varepsilon)$ such that after $N(\varepsilon)$ iterations the points of the orbit split the interval $[0, 1)$ into
intervals of length less than $\varepsilon$. Putting $\varepsilon = \rho' - \rho$, we get that among every $N = N(\varepsilon)$ consecutive iterations
there is a point in every interval of length $\rho' - \rho$. So the orbit visits $[1 - \alpha - (\rho' - \rho), 1 - \alpha)$
and $[1 - (\rho' - \rho), 1)$ within every $N$ consecutive iterations, and hence for every $n$ there exists $i \in [n, n + N - 1]$ with
$\omega_i \neq \omega'_i$.

□

□ 

We first consider the case of nonsingular Sturmian words: 

Lemma 5.3. Let $\omega \in \{0, 1\}^{\mathbb{N}}$ be a nonsingular Sturmian word and $p \in \beta\mathbb{N}$ an idempotent ultrafilter.
Then $p^*(\omega) = \omega$.

Proof. Suppose to the contrary that $p^*(\omega) \neq \omega$. Then since $\omega$ is nonsingular, Lemma 5.2 implies
that for all sufficiently long factors $u$ of $\omega$, we have that $\omega|_u \cap p^*(\omega)|_u = \emptyset$. But, by Lemma 3.8
we have $p^*(p^*(\omega)) = p^*(\omega)$, that is, the images under $p^*$ of $\omega$ and $p^*(\omega)$ coincide. It follows by
definition of $p^*$ that for every prefix $u$ of $p^*(\omega)$ we have $\omega|_u \in p$ and $p^*(\omega)|_u \in p$, and hence
$\omega|_u \cap p^*(\omega)|_u \in p$, a contradiction. □

Proof of Theorem 2. Let $\omega$ be a nonsingular Sturmian word, $u$ a prefix of $\omega$, and $p \in \beta\mathbb{N}$ an
idempotent ultrafilter. Then by Lemma 5.3 $u$ is a prefix of $p^*(\omega)$ and hence $\omega|_u \in p$. Thus for each
prefix $u$ of $\omega$ the set $\omega|_u$ belongs to every idempotent ultrafilter and hence is an IP*-set. It follows
that if $v \in \mathcal{F}$ is not a prefix of $\omega$, then $\omega|_v$ is not an IP-set. Finally, let $v$ be any factor of $\omega$ and
$n \in \mathbb{N}$. Then $\omega|_v - n = T^n(\omega)|_v$. If $n \in \omega|_v$, then $v$ is a prefix of $T^n(\omega)$ from which it follows that

$$\omega|_v - n = T^n(\omega)|_v \in p.$$

Hence $\omega|_v - n$ is an IP*-set. □

As a consequence of the above theorem we have 

Corollary 5.4. Let $\omega$ and $\omega'$ be two nonsingular Sturmian words, not necessarily of the same slope.
Then for every prefix $u$ of $\omega$ and every prefix $u'$ of $\omega'$ we have that $\omega|_u \cap \omega'|_{u'}$ is an IP*-set (resp.
central* set); in particular the intersection is infinite.

We note that the assumption that $\omega$ and $\omega'$ be nonsingular is necessary, as for example if we
consider $\omega = 0\mathbf{f}$ and $\omega' = 1\mathbf{f}$ with $\mathbf{f}$ the Fibonacci word, then $\omega|_0 \cap \omega'|_1 = \{0\}$.

Proof. Let $\omega$ and $\omega'$ be two nonsingular Sturmian words, $u$ a prefix of $\omega$, $u'$ a prefix of $\omega'$, and
$p \in \beta\mathbb{N}$ an idempotent ultrafilter. Then by Corollary ?? we have that $\omega|_u \in p$ and $\omega'|_{u'} \in p$, and
hence $\omega|_u \cap \omega'|_{u'} \in p$. Thus $\omega|_u \cap \omega'|_{u'}$ belongs to every idempotent and hence is an IP*-set. □

We next consider singular Sturmian words. 

Lemma 5.5. Let $\omega, \omega' \in \Omega$ be distinct Sturmian words such that $T^{n_0}(\omega) = T^{n_0}(\omega') = \tilde{\omega}$ for some
$n_0 \ge 1$. Then for every $u \in \mathcal{F}$ and every non-principal ultrafilter $p \in \beta\mathbb{N}$ we have

$$\omega|_u \in p \iff \omega'|_u \in p.$$

In particular, $p^*(\omega) = p^*(\omega')$.

Proof. Since $p$ is a non-principal ultrafilter, we have that $\omega|_u \in p \iff \omega|_u \cap [N, +\infty) \in p$ for all
$N \ge 1$. Similarly $\omega'|_u \in p \iff \omega'|_u \cap [N, +\infty) \in p$ for all $N \ge 1$. But for each $u \in \mathcal{F}$, we have
$\omega|_u \cap [n_0, +\infty) = \omega'|_u \cap [n_0, +\infty)$. The result now follows. □

Lemma 5.6. Let $\omega, \omega' \in \Omega$ be as in the previous lemma, and let $p \in \beta\mathbb{N}$ be an idempotent
ultrafilter. Then $p^*(\omega) = p^*(\omega') \in \{\omega, \omega'\}$.

Proof. That $p^*(\omega) = p^*(\omega')$ follows from the previous lemma and the fact that idempotent ultrafilters
are non-principal (see for instance [4]). By Lemma 3.9 $p^*$ commutes with the shift map $T$,
and hence

$$T^{n_0} p^*(\omega) = p^*(T^{n_0} \omega) = p^*(\tilde{\omega}) = \tilde{\omega}$$

where the last equality follows from Lemma 5.3. By Lemma 5.1 applied to $\omega'' = p^*(\omega)$ it follows
that $p^*(\omega) = \omega$ or $p^*(\omega) = \omega'$. □

Proof of Theorem 3. Let $\omega \in \Omega$ and $n_0$ be as in the statement of the theorem. Then there exists a
unique $\omega' \in \Omega$ with $\omega' \neq \omega$ and with $T^{n_0}(\omega') = \tilde{\omega}$. Suppose that $\omega|_u$ is an IP-set for some $u \in \mathcal{F}$.
Then by Lemma 3.6 it follows that $u$ is a prefix of $p^*(\omega)$ for some idempotent ultrafilter $p \in \beta\mathbb{N}$.
It follows from Lemma 5.6 that $u$ is a prefix of $\omega$ or a prefix of $\omega'$. This proves one direction.

To establish the other direction, we must show that $\omega|_u$ is a central set for each prefix $u$ of $\omega$ or of
$\omega'$. By Theorem 3.11 there exist minimal idempotent ultrafilters $p_1, p_2 \in \beta\mathbb{N}$ such that $p_1^*(\omega) = \omega$
and $p_2^*(\omega) = \omega'$. The result now follows. □

Remark 5.7. V. Bergelson suggested to us that the above result may be related to a previously
known partition of $\mathbb{N}$ into two central sets $X = \{\lfloor mx \rfloor, m \in \mathbb{N}\}$ and $Y = \{\lfloor my \rfloor, m \in \mathbb{N}\}$, where
$x$ and $y$ are two irrational numbers satisfying $1/x + 1/y = 1$. In fact, this partition precisely
corresponds to our partition of $\mathbb{N}$ into two IP-sets $\omega|_0$ and $\omega|_1$ where $\omega$ is of the form $0\tilde{\omega}$ and $\tilde{\omega}$ is
a characteristic Sturmian word.

This can be seen using the definition of Sturmian words via mechanical words (see Section 2
for notation). For a slope $\alpha$ we have $s_{\alpha,0} = 0\tilde{\omega}$. Let $\alpha = 1/x$ and $1/y = 1 - \alpha$; then $s_{\alpha,0}(n) = 1$ if
and only if there exists an integer $k$ such that $\alpha(n + 1) > k$ and $\alpha n < k$. It is easy to see that this
pair of inequalities is equivalent to $n < kx < n + 1$, which implies $n \in X$. We have $s_{\alpha,0}(n) = 0$ if
and only if there exists an integer $k$ such that $\alpha(n + 1) < k + 1$ and $\alpha n > k$. It is not difficult to
see that this pair of inequalities is equivalent to $n < (n - k)y < n + 1$, which implies $n \in Y$.
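For the Fibonacci slope $\alpha = (3 - \sqrt{5})/2$ one has $x = 1/\alpha = \varphi^2$ and $y = \varphi$, where $\varphi$ is the golden ratio, so the correspondence in this remark can be checked against $\mathbf{g} = 0\mathbf{f}$. The sketch below is restricted to $m \ge 1$ (an assumption of the illustration, since $m = 0$ contributes $0$ to both sets) and computes $\lfloor m\varphi \rfloor$ exactly with integer square roots.

```python
from math import isqrt

def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]

def floor_m_phi(m):
    """Exact floor(m * phi) for phi = (1 + sqrt(5)) / 2, via integer arithmetic."""
    return (m + isqrt(5 * m * m)) // 2

LIMIT = 1000
Y = {floor_m_phi(m) for m in range(1, LIMIT)}        # floor(m * y), y = phi
X = {floor_m_phi(m) + m for m in range(1, LIMIT)}    # floor(m * x), x = phi + 1

g = "0" + fibonacci_word(LIMIT)                      # g = 0f
ones = {n for n in range(1, LIMIT) if g[n] == "1"}
zeros = {n for n in range(1, LIMIT) if g[n] == "0"}

x_matches = {n for n in X if n < LIMIT} == ones
y_matches = {n for n in Y if n < LIMIT} == zeros
```

The identity $\lfloor m\varphi^2 \rfloor = \lfloor m\varphi \rfloor + m$ used for `X` follows from $\varphi^2 = \varphi + 1$.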

Remark 5.8. We do not know if the above results on Sturmian partitions extend to the broader class
of Arnoux-Rauzy words. In fact, our proof of Lemma 5.2 relies on the geometric interpretation
of Sturmian words as codings of orbits under an irrational rotation on the circle. It was shown
in [10] that there exist Arnoux-Rauzy words which are not measure-theoretically conjugate to a
rotation on the $n$-torus. In this case, we do not understand which pairs of Arnoux-Rauzy words in
the subshift are proximal.

6. Proofs of Theorems 4 & 5

We begin by briefly reviewing some notions from topological dynamics. By a topological flow
we mean a pair $(X, f)$ consisting of a compact set $X$ together with a homeomorphism $f$ of $X$. In
our framework we will consider $X$ to be a set consisting of bi-infinite words on a finite alphabet
and $f$ the shift map. A topological flow $(X, f)$ is said to be equicontinuous if for every $\varepsilon > 0$,
there exists a $\delta > 0$ such that for all $x, y \in X$, if $d(x, y) < \delta$ then $d(f^n(x), f^n(y)) < \varepsilon$ for every
$n \in \mathbb{Z}$. A topological flow $(Y, g)$ is called a factor of $(X, f)$ if there exists a continuous surjection

$$\pi : X \to Y$$

such that $\pi \circ f = g \circ \pi$. It is well known (for instance by way of Zorn's lemma) that every topological
flow $(X, f)$ has a maximal equicontinuous factor $(Y, g)$, i.e., $(Y, g)$ is an equicontinuous factor of
$(X, f)$ and any equicontinuous factor $(Z, h)$ of $(X, f)$ is also a factor of $(Y, g)$. It is also well
known that if $\pi : X \to Y$ is the maximal equicontinuous factor, then for any two points $x, y \in X$
we have that $\pi(x) = \pi(y)$ if and only if $x$ and $y$ are regionally proximal (see [2]).

Proof of Theorem 4. Let us fix positive integers $r$ and $N$. Consider the constant length substitution

$$\tau : \{1, 2, \ldots, r\} \to \{1, 2, \ldots, r\}^+$$

given by $1 \mapsto 123 \cdots r$, $2 \mapsto 23 \cdots r1$, $3 \mapsto 34 \cdots r12$, $\ldots$, $r \mapsto r12 \cdots (r - 1)$. In case $r = 2$
we have the Thue-Morse substitution on the alphabet $\{1, 2\}$. For $1 \le i \le r$, let $x^{(i)}$ denote the
fixed point of $\tau$ beginning in the letter $i$. As in the case of Thue-Morse, for $i \neq j$ the words $x^{(i)}$
and $x^{(j)}$ never coincide, i.e., $x^{(i)}_n \neq x^{(j)}_n$ for each $n \in \mathbb{N}$. Let $(X, T)$ denote the one-sided minimal
subshift generated by the primitive substitution $\tau$. We will now show that each of the fixed points
$x^{(i)}$ is distal.
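The claim that distinct fixed points of $\tau$ never coincide is easy to test on prefixes. The following sketch generates prefixes of the fixed points by iterating $\tau$ and compares them letter by letter; the prefix length of 200 is an arbitrary choice for the illustration.

```python
def tau_image(letter, r):
    """Image of a letter under tau: the cyclic word letter, letter+1, ..., of length r."""
    return [((letter - 1 + j) % r) + 1 for j in range(r)]

def fixed_point(i, r, length):
    """Prefix of the fixed point of tau beginning in the letter i."""
    w = [i]
    while len(w) < length:
        w = [b for a in w for b in tau_image(a, r)]
    return w[:length]

def never_coincide(r, length=200):
    """True if the prefixes of x^(i) and x^(j) differ in every position for i != j."""
    points = [fixed_point(i, r, length) for i in range(1, r + 1)]
    return all(len({p[n] for p in points}) == r for n in range(length))
```

For $r = 2$ the two prefixes are the Thue-Morse word on $\{1, 2\}$ and its letter-by-letter complement.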

Lemma 6.1. Let $x$ denote any one of the fixed points $x^{(i)}$ of the substitution $\tau$ above. Then $x$ is
distal. In particular, the two fixed points of the Thue-Morse substitution are each distal.

Proof. Let $(X, T)$ denote the two-sided subshift generated by $\tau$, and let $\pi : X \to Y$ denote the
maximal equicontinuous factor. The substitution $\tau$ above is of Pisot type; in fact, the dilation of $\tau$
is $r$ and all other eigenvalues are equal to 0. (Note that $\tau$ is not an irreducible substitution.) It is
proved in [3] that, for a primitive substitution of Pisot type (irreducible or not), the mapping onto
the maximal equicontinuous factor is finite to one. Thus there exists a constant $C$ such that for
any $z \in X$, there are at most $C$ points $z' \in X$ which are regionally proximal to $z$. In particular, for
any $z \in X$, there are at most $C$ points $z' \in X$ which are proximal to $z$.

Now suppose $y \in X$ is proximal to $x$. We will show that $y = x$. It is easy to see that the
bi-infinite word $z = x_{\mathrm{rev}} \cdot x \in X$, where $x_{\mathrm{rev}}$ denotes the reversal or mirror image of $x$, and
where $\cdot$ denotes the origin. Similarly, let $y'$ denote a left infinite word such that the concatenation
$z' = y' \cdot y \in X$. Since $x$ and $y$ are proximal, it follows that $z$ and $z'$ are proximal. Set $\sigma = \tau^r$.
Since $\tau$, and hence $\sigma$, are of constant length, it follows that $\sigma(z')$ is proximal to $\sigma(z)$. But $\sigma(z) = z$.
Hence $(\sigma^n(z'))_{n \ge 0}$ defines an infinite sequence of points in $X$ each of which is proximal to $z$, and
which in the limit tends to $x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$, where $i$ is the first (meaning rightmost) letter of $y'$ and $j$ is the
first letter of $y$. But since there are only finitely many points in $X$ which are proximal to $z$, it follows
that $\sigma^n(z') = x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$ for some $n \ge 0$. Hence by de-substituting we obtain $z' = x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$, from
which it follows that $y = x^{(j)}$. Thus both $x$ and $y$ are fixed points of $\tau$ which are assumed proximal.
It follows that $y = x$ and hence $x$ is distal as required. □

Put $x = x^{(1)}$. Since $x$ is distal, so is $T^n(x)$ for each $n \ge 1$. On the other hand, it is easy
to see that for each positive integer $n$ we have $u^{(i)}[n]x \in X$, where $u^{(i)}[n]$ denotes the reversal
of the prefix of $x^{(i)}$ of length $n$. Thus the $r$ words $\{u^{(1)}[n]x, u^{(2)}[n]x, \ldots, u^{(r)}[n]x\}$ are pairwise
proximal and begin in distinct letters (this is because the fixed points never coincide). Finally
let $\omega = u^{(1)}[N + 1]x$, and set $A_i = \omega|_i$ for each $1 \le i \le r$. Then each $A_i$ is a central set. For each
$1 \le n \le N$, we have that $A_i - n = T^n(\omega)|_i = u^{(1)}[N + 1 - n]x|_i$ is a central set. But for $k \ge 1$,
we have that $A_i - (N + k) = T^{k-1}(x)|_i$, which is a central set if and only if $T^{k-1}(x)$ begins in $i$.

□ 

Proof of Theorem 5. Fix a positive integer $r$. Let $\tau$ be a primitive substitution whose associated
subshift $\Omega$ is topologically weak mixing. For instance we may take the substitution $0 \mapsto 001$ and
$1 \mapsto 11001$, or $0 \mapsto 001$ and $1 \mapsto 11100$ (see [HI). Let $\omega \in \Omega$. Fix $m$ such that $p_\omega(m) \ge r$,
and put $s = p_\omega(m)$. Let $u_1, u_2, \ldots, u_s$ denote the factors of $\omega$ of length $m$. As pointed out to us
by V. Bergelson and Y. Son [7], the weak mixing implies that the set of points in $\Omega$ proximal to
$\omega$ is dense in $\Omega$ (see for instance page 184 of [20]). Thus for each factor $u_i$ there exists a word
$x_i \in \Omega$ beginning in $u_i$ and which is proximal to $\omega$. Hence by Theorem 3.11 there exists a minimal
idempotent ultrafilter $p_i \in \beta\mathbb{N}$ such that $p_i^*(\omega) = x_i$. Hence for each $1 \le i \le s$ we have that
$\omega|_{u_i} \in p_i$ and hence $\omega|_{u_i}$ is a central set. Finally, for each positive integer $n$ and for each $1 \le i \le s$
we have that

$$\omega|_{u_i} - n = T^n(\omega)|_{u_i}.$$

Again the weak mixing implies that there exists a word $x \in \Omega$ beginning in $u_i$ and proximal to
$T^n(\omega)$. Hence there exists a minimal idempotent $p \in \beta\mathbb{N}$ such that $p^*(T^n(\omega)) = x$, from which it
follows that $\omega|_{u_i} - n \in p$ and hence $\omega|_{u_i} - n$ is a central set. Thus we obtain a partition of $\mathbb{N}$

$$\mathbb{N} = \bigcup_{i=1}^{s} \omega|_{u_i}$$

into $s$-many central sets, and for each positive integer $n$ and $1 \le i \le s$ we have that $\omega|_{u_i} - n$ is
again a central set. Thus, setting

$$A_i = \omega|_{u_i}$$

for $i = 1, \ldots, r - 1$, and

$$A_r = \bigcup_{i=r}^{s} \omega|_{u_i}$$

we obtain the desired partition of $\mathbb{N}$. □

Footnote to the proof of Lemma 6.1. The authors of [3] study the maximal equicontinuous factor of 1-dimensional substitutive real tiling spaces. To apply
their finiteness result (Theorem 4.2 in [3]), we use the fact that in our setting all the tiles have the same length, and
hence proximality of points in $X$ with respect to the shift map $T$ implies proximality of the corresponding tilings under
the $\mathbb{R}$-action.

7. Infinite central partitions of N 

In this section we construct infinite partitions of $\mathbb{N}$ into central sets by using words on an infinite
alphabet. Our construction makes use of the notion of iterated palindromic closure operator (first
introduced in [14]):

Definition 7.1. The iterated palindromic operator $\psi$ is defined inductively as follows:

• $\psi(\varepsilon) = \varepsilon$,

• For any word $w$ and any letter $a$, $\psi(wa) = (\psi(w)a)^{(+)}$.

We denote by $w^{(+)}$ the right palindromic closure of the word $w$, i.e., the shortest palindrome
which has $w$ as a prefix.

For example, $\psi(aaba) = aabaaabaa$. The operator $\psi$ has been extensively studied for its central
role in constructing standard Sturmian and episturmian words. It follows immediately from the
definition that if $u$ is a prefix of $v$, then $\psi(u)$ is a prefix of $\psi(v)$. Thus, given an infinite word
$\omega = \omega_0 \omega_1 \omega_2 \ldots$ on the alphabet $\mathcal{A}$ we can define

$$\psi(\omega) = \lim_{n \to \infty} \psi(\omega_0 \omega_1 \omega_2 \cdots \omega_n).$$
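Definition 7.1 translates directly into a short program. The sketch below computes the right palindromic closure by locating the longest palindromic suffix and then iterates it letter by letter; the function names are illustrative, and words are represented as Python strings.

```python
def right_palindromic_closure(w):
    """Shortest palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:
            # w[i:] is the longest palindromic suffix; mirror what precedes it.
            return w + w[:i][::-1]

def psi(w):
    """Iterated palindromic closure: psi(empty) = empty, psi(wa) = (psi(w)a)^(+)."""
    result = ""
    for a in w:
        result = right_palindromic_closure(result + a)
    return result

if __name__ == "__main__":
    print(psi("aaba"))
```

Since `psi` processes its argument from left to right, the value on a prefix of a word is automatically a prefix of the value on the whole word, which is the property used to define $\psi$ on infinite words.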

The following lemma summarizes the properties of ip needed. 

Lemma 7.2. Let $\Delta$ be a right infinite word over the (finite or infinite) alphabet $\mathcal{A}$ and let $\omega = \psi(\Delta)$.
Then the following statements hold:

(1) The word $\omega$ is closed under reversal, i.e., if $v = v_1 v_2 \ldots v_k$ is a factor of $\omega$, then so is its
mirror image $v_k \cdots v_2 v_1$.

(2) The word $\omega$ is uniformly recurrent.

(3) If each letter $a \in \mathcal{A}$ appears in $\Delta$ an infinite number of times, then for each prefix $u$ of $\omega$
and each $a \in \mathcal{A}$, we have that $au$ is a factor of $\omega$.

Proof. Since any factor of $\omega$ is contained in some $\psi(v)$ for a sufficiently long prefix $v$ of $\Delta$, and
$\psi(v)$ is by definition a palindrome (and hence closed under reversal), the first statement is proved.
The second statement is easily derived from the fact that for any finite prefix $va$ of $\Delta$ ($a$ being a
letter), we have that $|\psi(va)| \le 2|\psi(v)| + 1$ and moreover $\psi(va)$ begins and ends in $\psi(v)$. It follows
that any factor of length (for example) $3|\psi(v)|$ contains an occurrence of $\psi(v)$.

Finally suppose each $a \in \mathcal{A}$ appears infinitely many times in $\Delta$. Thus for any letter $a$ and any
prefix $v$ of $\Delta$ there exists a prefix of $\Delta$ of the form $vv'a$. From the definition of $\psi$ we then have
that $\psi(vv')a$ is a prefix of $\omega$ and $\psi(vv')$ ends in $\psi(v)$, so $\psi(v)a$ is a factor of $\omega$. Since $\psi(v)$ is a
palindrome and $\omega$ is closed under reversal, we obtain that for any prefix $v$ of $\Delta$ and for any letter
$a$, the word $a\psi(v)$ is a factor of $\omega$, and the third statement easily follows. □

With the preceding Lemma, we are now able to construct infinite partitions of N such that each 
element of the partition is an IP-set. 

Proposition 7.3. Let $\omega = \psi(\Delta)$ where $\Delta$ is a right infinite word on an infinite alphabet $\mathcal{A}$ with the
property that each letter $a \in \mathcal{A}$ occurs in $\Delta$ an infinite number of times. Then, for any $a \in \mathcal{A}$, the
set $a\omega|_a$ is a central set; thus $\{\omega|_a + 1\}_{a \in \mathcal{A}}$ is an infinite partition of $\mathbb{N} \setminus \{0\}$ into central sets.

Proof. From Lemma 7.2 we have that $\omega$ is uniformly recurrent and closed under reversal. Furthermore,
since each $a \in \mathcal{A}$ occurs in $\Delta$ an infinite number of times, (3) of Lemma 7.2 implies that the set of factors
of $a\omega$ coincides with that of $\omega$. It follows therefore that $a\omega$ is uniformly recurrent as well. Let us
denote by $\pi_a$ the image of $\omega$ under the morphism $\mu_a$ defined as follows:

• $\mu_a(a) = 0$,

• $\mu_a(x) = 1$ if $x \neq a$.

Since $a\omega$ is uniformly recurrent for any $a$, it is clear that also $0\pi_a$ is uniformly recurrent for any
$a$. From Theorem 3.11 we then have that for any $a$ there exists a minimal idempotent ultrafilter
$p_a$ such that $p_a^*(0\pi_a) = 0\pi_a$. In particular, this means, by Lemma 3.3, that $0\pi_a|_0$ (which clearly
coincides with $a\omega|_a$ by definition) is a central set for any $a$. The statement can then be easily
derived from the fact that $a\omega|_a - 1 = \omega|_a$. □